Search | BVS Research Portal

Drug Target Identification with Machine Learning: How to Choose Negative Examples.

Najm, Matthieu; Azencott, Chloé-Agathe; Playe, Benoit; Stoven, Véronique.

Int J Mol Sci ; 22(10), 2021 May 12.

Article in English | MEDLINE | ID: mdl-34066072

ABSTRACT

Identifying the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search, limiting the number of required experiments. However, the drug-target interaction databases used for training present high statistical bias, leading to a high number of false positives and thus increasing the time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, so that each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs, and more globally for 200 approved drugs. For the three detailed drug examples, and for the larger set of 200 drugs, training with the proposed scheme for the choice of negative examples improved target prediction results: the average number of false positives among the top-ranked predicted targets decreased and, overall, the rank of the true targets improved. Our method corrects the databases' statistical bias and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken.
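
The balancing constraint described in this abstract can be illustrated with a short Python sketch. This is not the authors' published algorithm: it is a simple greedy heuristic, assumed here for illustration only, that tries to give every drug and every protein as many negative pairs as it has positive pairs. The function name `balanced_negatives` and the toy identifiers (`d1`, `p1`, ...) are hypothetical, and on real databases a greedy pass like this may fail to reach exact balance.

```python
from collections import Counter, defaultdict

def balanced_negatives(positives):
    """Greedily pick negative (drug, protein) pairs so that counts mirror the positives.

    Illustrative heuristic only; exact balance is not guaranteed in general.
    """
    positives = set(positives)
    drug_count = Counter(d for d, _ in positives)
    quota = Counter(p for _, p in positives)  # negatives still owed to each protein
    known = defaultdict(set)                  # proteins already known to bind each drug
    for d, p in positives:
        known[d].add(p)

    negatives = []
    # Drugs with many positives are the hardest to place, so handle them first.
    for drug in sorted(drug_count, key=lambda d: (-drug_count[d], d)):
        candidates = sorted(
            (p for p in quota if quota[p] > 0 and p not in known[drug]),
            key=lambda p: (-quota[p], p),     # prefer proteins owed the most negatives
        )
        for protein in candidates[:drug_count[drug]]:
            negatives.append((drug, protein))
            quota[protein] -= 1
    return negatives

# Toy data: every drug and every protein appears in exactly two positive pairs.
positives = [
    ("d1", "p1"), ("d1", "p2"),
    ("d2", "p2"), ("d2", "p3"),
    ("d3", "p3"), ("d3", "p4"),
    ("d4", "p4"), ("d4", "p1"),
]
negatives = balanced_negatives(positives)
```

On this toy set the heuristic reaches exact balance. When a drug already interacts with most proteins, no balanced assignment may exist, and the function then returns fewer negatives than positives.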

Subjects

Computational Biology/methods , Drug Discovery/methods , Machine Learning , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Software , Humans , Pharmaceutical Preparations/metabolism , Protein Interaction Mapping , Proteins/metabolism , Support Vector Machine

Evaluation of deep and shallow learning methods in chemogenomics for the prediction of drugs specificity.

Playe, Benoit; Stoven, Véronique.

J Cheminform ; 12(1): 11, 2020 Feb 10.

Article in English | MEDLINE | ID: mdl-33431042

ABSTRACT

Chemogenomics, also called proteochemometrics, covers a range of computational methods that can be used to predict protein-ligand interactions at large scales in the protein and chemical spaces. They differ from more classical ligand-based methods (also called QSAR) that predict ligands for a given protein receptor. In the context of the drug discovery process, chemogenomics makes it possible to tackle the question of predicting off-target proteins for drug candidates, one of the main causes of undesirable side effects and of failure within drug development processes. The present study compares shallow and deep machine-learning approaches for chemogenomics, and explores data augmentation techniques for deep learning algorithms in chemogenomics. Shallow machine-learning algorithms rely on expert-based chemical and protein descriptors, while recent developments in deep learning algorithms make it possible to learn abstract numerical representations of molecular graphs and protein sequences, in order to optimise the performance of the prediction task. We first propose a formulation of chemogenomics with deep learning, called the chemogenomic neural network (CN), as a feed-forward neural network taking as input the combination of molecule and protein representations learnt by molecular graph and protein sequence encoders. We show that, on large datasets, the deep learning CN model outperforms state-of-the-art shallow methods, and competes with deep methods that use expert-based descriptors. However, on small datasets, shallow methods achieve better prediction performance than deep learning methods. Then, we evaluate data augmentation techniques, namely multi-view and transfer learning, to improve the prediction performance of the chemogenomic neural network. We conclude that a promising research direction is to integrate heterogeneous sources of data, such as auxiliary tasks for which large datasets are available or, independently, multiple molecule and protein attribute views.
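
As a rough illustration of the architecture described in this abstract, the sketch below scores one (molecule, protein) pair with a small feed-forward network in plain Python. Several things are assumptions, not details from the paper: the "combination" of the two representations is taken to be concatenation, the learnt graph and sequence encoders are replaced by random stand-in vectors, and the weights are random and untrained. All names (`predict_interaction`, `params`, the dimensions) are hypothetical.

```python
import math
import random

def dense(weights, bias, x):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def predict_interaction(mol_vec, prot_vec, params):
    """Return an interaction score in (0, 1) for one (molecule, protein) pair."""
    pair = list(mol_vec) + list(prot_vec)  # assumed combination: concatenation
    hidden = [max(0.0, h) for h in dense(params["W1"], params["b1"], pair)]  # ReLU
    logit = dense(params["W2"], params["b2"], hidden)[0]
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid

rng = random.Random(0)

def random_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

mol_dim, prot_dim, hidden_dim = 8, 12, 16  # arbitrary illustrative sizes
params = {
    "W1": random_matrix(hidden_dim, mol_dim + prot_dim),
    "b1": [0.0] * hidden_dim,
    "W2": random_matrix(1, hidden_dim),
    "b2": [0.0],
}

# Stand-ins for the outputs of the molecular graph and protein sequence encoders.
mol_vec = [rng.gauss(0.0, 1.0) for _ in range(mol_dim)]
prot_vec = [rng.gauss(0.0, 1.0) for _ in range(prot_dim)]
score = predict_interaction(mol_vec, prot_vec, params)
```

In a trained model, the encoders and the feed-forward layers would be optimised jointly on the prediction task; here the score is meaningless beyond showing the data flow.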

Efficient multi-task chemogenomics for drug specificity prediction.

Playe, Benoit; Azencott, Chloé-Agathe; Stoven, Véronique.

PLoS One ; 13(10): e0204999, 2018.

Article in English | MEDLINE | ID: mdl-30286165

ABSTRACT

Adverse drug reactions, also called side effects, range from mild to fatal clinical events and significantly affect the quality of care. Among other causes, side effects occur when drugs bind to proteins other than their intended target. As experimentally testing drug specificity against the entire proteome is out of reach, we investigate the application of chemogenomics approaches. We formulate the study of drug specificity as a problem of predicting interactions between drugs and proteins at the proteome scale. We build several benchmark datasets, and propose NN-MT, a multi-task Support Vector Machine (SVM) algorithm that is trained on a limited number of data points, in order to solve the computational issues of proteome-wide SVM for chemogenomics. We compare NN-MT to different state-of-the-art methods, and show that its prediction performance is similar or better, at an efficient computational cost. Compared to its competitors, the proposed method is particularly effective at predicting (protein, ligand) interactions in the difficult double-orphan case, i.e. when no interactions are previously known for either the protein or the ligand. The NN-MT algorithm appears to be a good default method, providing state-of-the-art or better performance in the wide range of prediction scenarios considered in the present study: proteome-wide prediction, protein family prediction, test (protein, ligand) pairs dissimilar to pairs in the train set, and orphan cases.
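
The orphan settings mentioned in this abstract can be made concrete with a small Python helper. It only labels a test (protein, ligand) pair according to whether the protein and the ligand already appear in the known training interactions; it does not implement NN-MT itself. The function name `orphan_setting`, the category labels other than "double-orphan", and the toy identifiers are hypothetical.

```python
def orphan_setting(test_pair, train_pairs):
    """Label a test (protein, ligand) pair by what the training interactions cover."""
    protein, ligand = test_pair
    known_proteins = {p for p, _ in train_pairs}
    known_ligands = {l for _, l in train_pairs}
    protein_seen = protein in known_proteins
    ligand_seen = ligand in known_ligands
    if not protein_seen and not ligand_seen:
        return "double-orphan"   # no known interaction for either member of the pair
    if not protein_seen:
        return "orphan protein"
    if not ligand_seen:
        return "orphan ligand"
    return "non-orphan"

# Toy known interactions, written as (protein, ligand) pairs.
train_pairs = [("P1", "L1"), ("P1", "L2"), ("P2", "L2")]

labels = {
    pair: orphan_setting(pair, train_pairs)
    for pair in [("P2", "L1"), ("P3", "L1"), ("P1", "L3"), ("P3", "L3")]
}
```

A split like this is one way to report prediction performance separately for each scenario, since the double-orphan pairs are the ones for which the model has no direct training signal.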

Subjects

Genomics , Pharmaceutical Preparations , Drug-Related Side Effects and Adverse Reactions/diagnosis , Pharmaceutical Preparations/metabolism , Prognosis , Support Vector Machine
