Results 1 - 5 of 5
1.
Brief Bioinform ; 22(6), 2021 11 05.
Article in English | MEDLINE | ID: mdl-34382071

ABSTRACT

The goal of precision oncology is to tailor treatment for patients individually using the genomic profile of their tumors. Pharmacogenomics datasets such as cancer cell lines are among the most valuable resources for drug sensitivity prediction, a crucial task of precision oncology. Machine learning methods have been employed to predict drug sensitivity based on the multiple omics data available for large panels of cancer cell lines. However, there are no comprehensive guidelines on how to properly train and validate such machine learning models for drug sensitivity prediction. In this paper, we introduce a set of guidelines for different aspects of training gene expression-based predictors using cell line datasets. These guidelines provide extensive analysis of the generalization of drug sensitivity predictors and challenge many current practices in the community including the choice of training dataset and measure of drug sensitivity. The application of these guidelines in future studies will enable the development of more robust preclinical biomarkers.


Subjects
Drug Resistance, Neoplasm , Machine Learning , Pharmacogenetics , Algorithms , Cell Line, Tumor , Datasets as Topic , Humans
2.
Bioinformatics ; 36(Suppl_1): i380-i388, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32657371

ABSTRACT

MOTIVATION: The goal of pharmacogenomics is to predict drug response in patients using their single- or multi-omics data. A major challenge is that clinical data (i.e. patients) with drug response outcomes are very limited, creating a need for transfer learning to bridge the gap between large pre-clinical pharmacogenomics datasets (e.g. cancer cell lines), as a source domain, and clinical datasets as a target domain. Two major discrepancies exist between pre-clinical and clinical datasets: (i) in the input space, the gene expression data differ because of differences in the underlying biology, and (ii) in the output space, drug response is measured differently. Therefore, training a computational model on cell lines and testing it on patients violates the i.i.d. assumption that training and test data are drawn from the same distribution. RESULTS: We propose Adversarial Inductive Transfer Learning (AITL), a deep neural network method for addressing discrepancies in input and output space between the pre-clinical and clinical datasets. AITL takes gene expression of patients and cell lines as the input, employs adversarial domain adaptation and multi-task learning to address these discrepancies, and predicts the drug response as the output. To the best of our knowledge, AITL is the first adversarial inductive transfer learning method to address both input and output discrepancies. Experimental results indicate that AITL outperforms state-of-the-art pharmacogenomics and transfer learning baselines and may guide precision oncology more accurately. AVAILABILITY AND IMPLEMENTATION: https://github.com/hosseinshn/AITL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Neoplasms , Pharmacogenetics , Humans , Machine Learning , Neural Networks, Computer , Precision Medicine
3.
Bioinformatics ; 35(14): i501-i509, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510700

ABSTRACT

MOTIVATION: Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve the prediction accuracy, which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned that a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance. RESULTS: We propose MOLI, a multi-omics late integration method based on deep neural networks. MOLI takes somatic mutation, copy number aberration and gene expression data as input, and integrates them for drug response prediction. MOLI uses type-specific encoding sub-networks to learn features for each omics type, concatenates them into one representation and optimizes this representation via a combined cost function consisting of a triplet loss and a binary cross-entropy loss. The former makes the representations of responder samples more similar to each other and different from the non-responders, and the latter makes this representation predictive of the response values. We validate MOLI on in vitro and in vivo datasets for five chemotherapy agents and two targeted therapeutics. Compared to state-of-the-art single-omics and early integration multi-omics methods, MOLI achieves higher prediction accuracy in external validations. Moreover, a significant improvement in MOLI's performance is observed for targeted drugs when training on a pan-drug input, i.e. using all the drugs with the same target, compared to training only on drug-specific inputs. MOLI's high predictive power suggests it may have utility in precision oncology. AVAILABILITY AND IMPLEMENTATION: https://github.com/hosseinshn/MOLI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
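
The combined cost function described in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their code is at the repository linked above); the function names, the squared Euclidean distance, the margin value, and the weighting factor `gamma` are all assumptions made for illustration.

```python
import numpy as np

def late_integrate(omics_features):
    """Concatenate per-omics feature matrices (samples x features) into one representation."""
    return np.concatenate(omics_features, axis=1)

def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=float)
    return float(-np.mean(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on squared Euclidean distances (assumed distance choice)."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # anchor vs. same-class sample
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # anchor vs. other-class sample
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))

def combined_loss(anchor, positive, negative, probs, labels, gamma=0.5, margin=1.0):
    """Classification loss plus a weighted triplet term; gamma is a hypothetical weight."""
    return binary_cross_entropy(probs, labels) + gamma * triplet_loss(
        anchor, positive, negative, margin
    )
```

In this sketch the triplet term pulls responder representations together and pushes them away from non-responders, while the cross-entropy term ties the same representation to the response label, mirroring the two roles the abstract assigns to the two losses.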


Subjects
Antineoplastic Agents , Neoplasms , Neural Networks, Computer , Algorithms , Forecasting , Humans , Neoplasms/drug therapy , Pharmaceutical Preparations , Precision Medicine
4.
Genomics ; 107(2-3): 83-87, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26762945

ABSTRACT

One of the central challenges in cancer research is identifying significant genes among thousands of others on a microarray. Since preventing the outbreak and progression of cancer is the ultimate goal in bioinformatics and computational biology, detecting the genes that are most involved is vital. In this article, we propose a Maximum-Minimum Correntropy Criterion (MMCC) approach for selecting informative genes from microarray data sets. The approach is stable, fast, robust against diverse noise and outliers, and competitively accurate in comparison with other algorithms. Moreover, the optimal number of features for each data set is determined via an evolutionary optimization process. Through broad experimental evaluation, MMCC is shown to perform significantly better than other well-known gene selection algorithms on 25 commonly used microarray data sets. Surprisingly, high classification accuracy with a Support Vector Machine (SVM) is achieved using fewer than 10 genes selected by MMCC in all of the cases.


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Neoplasms/genetics , Algorithms , Genetic Predisposition to Disease , Humans , Pattern Recognition, Automated , Support Vector Machine
5.
IET Syst Biol ; 10(6): 229-236, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27879477

ABSTRACT

One of the most important needs in the post-genome era is providing researchers with reliable and efficient computational tools to extract and analyse the huge amount of available biological data, of which DNA copy number variation (CNV) is a vitally important type. Array-based comparative genomic hybridisation (aCGH) is a common approach for detecting CNVs. Most methods for this purpose were proposed for one-dimensional profiles. However, the focus has gradually moved from one- to multi-dimensional signals. In addition, since contamination of these profiles with noise is always an issue, it is highly important to have a robust method for analysing multi-sample aCGH profiles. In this study, the authors propose robust group fused lasso, which utilises the robust group total variations. Instead of the l2,1 norm, the l1 - l2 M-estimator is used, which is more robust in dealing with non-Gaussian noise and high corruption. More importantly, Correntropy (Welsch M-estimator) is also applied to the fitting error. Extensive experiments indicate that the proposed method outperforms the state-of-the-art algorithms and techniques under a wide range of scenarios with diverse types of noise.
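
To illustrate why a Welsch M-estimator is robust to outliers, the sketch below compares it with the ordinary squared-error loss. It uses the common textbook form of the Welsch function, rho(r) = (c^2 / 2) * (1 - exp(-(r / c)^2)); the scale parameter `c` and the function names are assumptions, and the abstract does not state which scaling convention the authors use.

```python
import numpy as np

def welsch_loss(residual, c=1.0):
    """Welsch M-estimator loss; bounded above by c**2 / 2 for any residual size."""
    r = np.asarray(residual, dtype=float)
    return (c ** 2 / 2.0) * (1.0 - np.exp(-((r / c) ** 2)))

def squared_loss(residual):
    """Ordinary least-squares loss r**2 / 2, which grows without bound."""
    r = np.asarray(residual, dtype=float)
    return 0.5 * r ** 2

# Small residuals are penalised almost identically by both losses,
# while a large (outlier) residual saturates under the Welsch loss.
residuals = np.array([0.01, 0.1, 1.0, 10.0, 100.0])
for r, w, s in zip(residuals, welsch_loss(residuals), squared_loss(residuals)):
    print(f"residual={r:7.2f}  welsch={w:.6f}  squared={s:.6f}")
```

Because the loss saturates, a single heavily corrupted probe contributes at most a fixed amount to the fitting error, which is the property that makes this family of estimators suited to non-Gaussian noise.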


Subjects
Comparative Genomic Hybridization , DNA Copy Number Variations , Algorithms , Breast Neoplasms/metabolism , Computational Biology , Databases, Genetic , Female , Genome, Human , Humans , Models, Statistical , Normal Distribution , Polymorphism, Single Nucleotide , ROC Curve , Signal-To-Noise Ratio , Software , Uncertainty