Results 1 - 20 of 32
1.
Int J Mol Sci ; 23(16)2022 Aug 11.
Article in English | MEDLINE | ID: mdl-36012204

ABSTRACT

Proteins interacting with CFTR and its mutants have been intensively studied using different experimental approaches. These studies provided information on the cellular processes leading to proper protein folding, routing to the plasma membrane, recycling, activation and degradation. Recently, new approaches have been developed based on the proximity labeling of protein partners or proteins in close vicinity and their subsequent identification by mass spectrometry. In this study, we evaluated TurboID- and APEX2-based proximity labeling of WT CFTR and compared the obtained data to those reported in databases. The CFTR-WT interactome was then compared to that of two CFTR (G551D and W1282X) mutants and the structurally unrelated potassium channel KCNK3. The two proximity labeling approaches identified both known and additional CFTR protein partners, including multiple SLC transporters. Proximity labeling approaches provided a more comprehensive picture of the CFTR interactome and improved our knowledge of the CFTR environment.


Subject(s)
Cystic Fibrosis Transmembrane Conductance Regulator , Protein Folding , Cell Membrane/metabolism , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/metabolism , Mass Spectrometry , Mutation
2.
Int J Mol Sci ; 22(10)2021 May 12.
Article in English | MEDLINE | ID: mdl-34066072

ABSTRACT

Identification of the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search, limiting the number of required experiments. However, the drug-target interaction databases used for training present high statistical bias, leading to a high number of false positives and thus increasing the time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, so that each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs, and more globally for 200 approved drugs. For the detailed three drug examples, and for the larger set of 200 drugs, training with the proposed scheme for the choice of negative examples improved target prediction results: the average number of false positives among the top ranked predicted targets decreased, and overall, the rank of the true targets was improved. Our method corrects databases' statistical bias and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken.


Subject(s)
Computational Biology/methods , Drug Discovery/methods , Machine Learning , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Software , Humans , Pharmaceutical Preparations/metabolism , Protein Interaction Mapping , Proteins/metabolism , Support Vector Machine
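
The negative-example scheme described in the abstract of record 2 (each protein and each drug appearing equally often among positives and negatives) can be illustrated with a minimal sketch. It assumes a greedy, best-effort pairing; the helper name `balanced_negatives` and the toy data are hypothetical and are not taken from the paper, whose actual procedure may differ.

```python
from collections import Counter


def balanced_negatives(positives):
    """Greedy, best-effort choice of negative (drug, protein) pairs.

    Goal: each drug and each protein appears as often among the negatives
    as among the positives. Hypothetical helper, not the paper's code.
    """
    positives = set(positives)
    drug_quota = Counter(d for d, _ in positives)
    prot_quota = Counter(p for _, p in positives)
    negatives = set()
    # Serve drugs with the largest quota first; ties are broken by name.
    for drug in sorted(drug_quota, key=lambda d: (-drug_quota[d], d)):
        while drug_quota[drug] > 0:
            candidates = [
                p for p in prot_quota
                if prot_quota[p] > 0
                and (drug, p) not in positives
                and (drug, p) not in negatives
            ]
            if not candidates:
                break  # greedy choice got stuck; exact balance is not guaranteed
            # Prefer the protein with the most remaining quota, then by name.
            prot = min(candidates, key=lambda p: (-prot_quota[p], p))
            negatives.add((drug, prot))
            drug_quota[drug] -= 1
            prot_quota[prot] -= 1
    return negatives


# Toy data: four drugs, each with one known target (assumed for illustration).
toy_positives = [("d1", "p1"), ("d2", "p2"), ("d3", "p3"), ("d4", "p4")]
toy_negatives = balanced_negatives(toy_positives)
print(sorted(toy_negatives))
```

On this toy input every drug and protein ends up with exactly one negative pair. On real interaction data a greedy pass can fail to reach exact balance, so the counts should be checked after sampling.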
3.
PLoS Comput Biol ; 15(9): e1007381, 2019 09.
Article in English | MEDLINE | ID: mdl-31568528

ABSTRACT

Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help find new therapeutic targets or biomarkers. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly for driver genes specific to some cancer types. In this paper, we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning based approach that integrates various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms five other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types.


Subject(s)
Algorithms , Computational Biology/methods , Machine Learning , Neoplasms/genetics , Oncogenes/genetics , Software , Humans , Models, Statistical
4.
Int J Mol Sci ; 21(18)2020 Sep 10.
Article in English | MEDLINE | ID: mdl-32927759

ABSTRACT

Background: The prevalence of chronic kidney disease is increased in patients with cystic fibrosis (CF). The study of urinary exosomal proteins might provide insight into the pathophysiology of CF kidney disease. Methods: Urine samples were collected from 19 CF patients (among those 7 were treated by cystic fibrosis transmembrane conductance regulator (CFTR) modulators), and 8 healthy subjects. Urine exosomal protein content was determined by high resolution mass spectrometry. Results: A heatmap of the differentially expressed proteins in urinary exosomes showed a clear separation between control and CF patients. Seventeen proteins were upregulated in CF patients (including epidermal growth factor receptor (EGFR); proteasome subunit beta type-6, transglutaminases, caspase 14) and 118 were downregulated (including glutathione S-transferases, superoxide dismutase, klotho, endosomal sorting complex required for transport, and matrisome proteins). Gene set enrichment analysis revealed 20 gene sets upregulated and 74 downregulated. Treatment with CFTR modulators yielded no significant modification of the proteomic content. These results highlight that CF kidney cells adapt to the CFTR defect by upregulating proteasome activity and that autophagy and endosomal targeting are impaired. Increased expression of EGFR and decreased expression of klotho and matrisome might play a central role in this CF kidney signature by inducing oxidation, inflammation, accelerated senescence, and abnormal tissue repair. Conclusions: Our study unravels novel insights into consequences of CFTR dysfunction in the urinary tract, some of which may have clinical and therapeutic implications.


Subject(s)
Cystic Fibrosis/urine , Exosomes/metabolism , Kidney Diseases/urine , Adolescent , Adult , Aminophenols/therapeutic use , Aminopyridines/therapeutic use , Benzodioxoles/therapeutic use , Case-Control Studies , Child , Child, Preschool , Cystic Fibrosis/complications , Cystic Fibrosis/drug therapy , Drug Combinations , Humans , Indoles/therapeutic use , Kidney Diseases/etiology , Proteome , Quinolones/therapeutic use , Young Adult
5.
Bioorg Med Chem ; 26(20): 5510-5530, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30309671

ABSTRACT

The TAM kinase family is emerging as a new effective and attractive therapeutic target for cancer therapy, autoimmune and viral diseases. A series of 2,6-disubstituted imidazo[4,5-b]pyridines were designed, synthesized and identified as highly potent TAM inhibitors. Despite remarkable structural similarities within the TAM family, compounds 28 and 25 demonstrated high activity and selectivity in vitro against AXL and MER, with IC50 values of 0.77 nM and 9 nM, respectively, and a 120- to 900-fold selectivity. We also observed an unexpected nuclear localization for compound 10Bb, thanks to nanoSIMS technology, which could be correlated to the absence of cytotoxicity on three different cancer cell lines that are sensitive to TAM inhibition.


Subject(s)
Imidazoles/chemistry , Imidazoles/pharmacology , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacology , Proto-Oncogene Proteins/antagonists & inhibitors , Pyridines/chemistry , Pyridines/pharmacology , Receptor Protein-Tyrosine Kinases/antagonists & inhibitors , c-Mer Tyrosine Kinase/antagonists & inhibitors , A549 Cells , Drug Design , Humans , Imidazoles/chemical synthesis , Imidazoles/pharmacokinetics , Models, Molecular , Protein Kinase Inhibitors/chemical synthesis , Protein Kinase Inhibitors/pharmacokinetics , Proto-Oncogene Proteins/metabolism , Pyridines/chemical synthesis , Pyridines/pharmacokinetics , Receptor Protein-Tyrosine Kinases/metabolism , Structure-Activity Relationship , c-Mer Tyrosine Kinase/metabolism , Axl Receptor Tyrosine Kinase
6.
J Biol Chem ; 288(25): 18561-73, 2013 Jun 21.
Article in English | MEDLINE | ID: mdl-23653352

ABSTRACT

Widespread drug resistance calls for the urgent development of new antimalarials that target novel steps in the life cycle of Plasmodium falciparum and Plasmodium vivax. The essential subtilisin-like serine protease SUB1 of Plasmodium merozoites plays a dual role in egress from and invasion into host erythrocytes. It belongs to a new generation of attractive drug targets against which specific potent inhibitors are actively sought. We characterize here the P. vivax SUB1 enzyme and show that it displays a typical auto-processing pattern and apical localization in P. vivax merozoites. To search for small PvSUB1 inhibitors, we took advantage of the similarity of SUB1 with bacterial subtilisins and generated P. vivax SUB1 three-dimensional models. The structure-based virtual screening of a large commercial chemical compound library identified 306 virtual best hits, of which 37 were experimentally confirmed inhibitors and 5 had Ki values of <50 µM for PvSUB1. Interestingly, they belong to different chemical families. The most promising competitive inhibitor of PvSUB1 (compound 2) was equally active on PfSUB1 and displayed anti-P. falciparum and Plasmodium berghei activity in vitro and in vivo, respectively. Compound 2 inhibited the endogenous PfSUB1 as illustrated by the inhibited maturation of its natural substrate PfSERA5 and inhibited parasite egress and subsequent erythrocyte invasion. These data indicate that the strategy of in silico screening of three-dimensional models to select for virtual inhibitors combined with stringent biological validation successfully identified several inhibitors of the PvSUB1 enzyme. The most promising hit proved to be a potent cross-inhibitor of Plasmodium SUB1, laying the groundwork for the development of a globally active small compound antimalarial.


Subject(s)
Plasmodium vivax/enzymology , Protein Structure, Tertiary , Protozoan Proteins/chemistry , Serine Proteases/chemistry , Amino Acid Sequence , Animals , Antimalarials/chemistry , Antimalarials/pharmacology , Binding Sites/genetics , Biocatalysis/drug effects , Dose-Response Relationship, Drug , Erythrocytes/drug effects , Erythrocytes/parasitology , Female , Kinetics , Malaria/parasitology , Malaria/prevention & control , Merozoites/drug effects , Merozoites/enzymology , Mice , Models, Molecular , Molecular Sequence Data , Molecular Structure , Plasmodium berghei/drug effects , Plasmodium berghei/enzymology , Plasmodium vivax/drug effects , Plasmodium vivax/genetics , Protozoan Proteins/genetics , Protozoan Proteins/metabolism , Sequence Homology, Amino Acid , Serine Proteases/genetics , Serine Proteases/metabolism , Serine Proteinase Inhibitors/chemistry , Serine Proteinase Inhibitors/pharmacology , Sf9 Cells , Substrate Specificity
7.
NPJ Syst Biol Appl ; 10(1): 8, 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38242871

ABSTRACT

The efficiency of analyzing high-throughput data in systems biology has been demonstrated in numerous studies, where molecular data, such as transcriptomics and proteomics, offer great opportunities for understanding the complexity of biological processes. One important aspect of data analysis in systems biology is the shift from a reductionist approach that focuses on individual components to a more integrative perspective that considers the system as a whole, where the emphasis has shifted from differential expression of individual genes to determining the activity of gene sets. Here, we present the rROMA software package for fast and accurate computation of the activity of gene sets with coordinated expression. The rROMA package incorporates significant improvements in the calculation algorithm, along with the implementation of several functions for statistical analysis and visualizing results. These additions greatly expand the package's capabilities and offer valuable tools for data analysis and interpretation. It is an open-source package available on GitHub at www.github.com/sysbio-curie/rROMA. Based on publicly available transcriptomic datasets, we applied rROMA to cystic fibrosis, highlighting biological mechanisms potentially involved in the establishment and progression of the disease and the associated genes. Results indicate that rROMA can detect disease-related active signaling pathways using transcriptomic and proteomic data. The results notably identified a significant mechanism relevant to cystic fibrosis, raised awareness of a possible bias related to cell culture, and uncovered an intriguing gene that warrants further investigation.


Subject(s)
Cystic Fibrosis , Proteomics , Humans , Proteomics/methods , Gene Expression Profiling/methods , Transcriptome/genetics , Systems Biology/methods
8.
Bioinformatics ; 28(18): i487-i494, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22962471

ABSTRACT

MOTIVATION: Drug effects are mainly caused by the interactions between drug molecules and their target proteins, including primary targets and off-targets. Identification of the molecular mechanisms behind overall drug-target interactions is crucial in the drug design process. RESULTS: We develop a classifier-based approach to identify chemogenomic features (the underlying associations between drug chemical substructures and protein domains) that are involved in drug-target interaction networks. We propose a novel algorithm for extracting informative chemogenomic features by using L1-regularized classifiers over the tensor product space of possible drug-target pairs. It is shown that the proposed method can extract a very limited number of chemogenomic features without losing performance in predicting drug-target interactions, and that the extracted features are biologically meaningful. The extracted substructure-domain association network enables us to suggest ligand chemical fragments specific for each protein domain and ligand core substructures important for a wide range of protein families. AVAILABILITY: Software is available at the supplemental website. CONTACT: yamanishi@bioreg.kyushu-u.ac.jp SUPPLEMENTARY INFORMATION: Datasets and all results are available at http://cbio.ensmp.fr/~yyamanishi/l1binary/.


Subject(s)
Algorithms , Drug Design , Pharmaceutical Preparations/chemistry , Protein Structure, Tertiary , Drug Delivery Systems , Humans , Ligands , Linear Models , Proteins/chemistry , Proteins/classification , Proteins/metabolism
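
The "tensor product space of possible drug-target pairs" mentioned in the abstract of record 8 can be made concrete with a small sketch. It assumes binary fingerprints for drug substructures and protein domains; the function names and the toy vectors are hypothetical and do not come from the authors' software.

```python
def pair_features(substructures, domains):
    """Flattened outer (tensor) product of two binary fingerprints.

    Each output entry corresponds to one (substructure, domain) pair and is 1
    only when the drug has the substructure AND the protein has the domain.
    """
    return [s * d for s in substructures for d in domains]


def active_pairs(features, n_domains):
    """Map non-zero feature indices back to (substructure, domain) index pairs."""
    return [(k // n_domains, k % n_domains) for k, v in enumerate(features) if v]


drug = [1, 0, 1]   # hypothetical substructure fingerprint (3 substructures)
target = [0, 1]    # hypothetical domain fingerprint (2 domains)

features = pair_features(drug, target)
print(features)                             # [0, 1, 0, 0, 0, 1]
print(active_pairs(features, len(target)))  # [(0, 1), (2, 1)]
```

An L1-penalized linear classifier trained on such vectors would be expected to drive most weights to zero, leaving a small set of substructure-domain pairs to interpret. Which solver the authors used is not stated in the abstract.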
9.
Bioinformatics ; 28(18): i522-i528, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22962476

ABSTRACT

MOTIVATION: Identifying the emergence and underlying mechanisms of drug side effects is a challenging task in the drug development process. This underscores the importance of system-wide approaches for linking different scales of drug actions; namely drug-protein interactions (molecular scale) and side effects (phenotypic scale) toward side effect prediction for uncharacterized drugs. RESULTS: We performed a large-scale analysis to extract correlated sets of targeted proteins and side effects, based on the co-occurrence of drugs in protein-binding profiles and side effect profiles, using sparse canonical correlation analysis. The analysis of 658 drugs with the two profiles for 1368 proteins and 1339 side effects led to the extraction of 80 correlated sets. Enrichment analyses using KEGG and Gene Ontology showed that most of the correlated sets were significantly enriched with proteins that are involved in the same biological pathways, even if their molecular functions are different. This allowed for a biologically relevant interpretation regarding the relationship between drug-targeted proteins and side effects. The extracted side effects can be regarded as possible phenotypic outcomes by drugs targeting the proteins that appear in the same correlated set. The proposed method is expected to be useful for predicting potential side effects of new drug candidate compounds based on their protein-binding profiles. SUPPLEMENTARY INFORMATION: Datasets and all results are available at http://web.kuicr.kyoto-u.ac.jp/supp/smizutan/target-effect/. AVAILABILITY: Software is available at the above supplementary website. CONTACT: yamanishi@bioreg.kyushu-u.ac.jp, or goto@kuicr.kyoto-u.ac.jp.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Models, Statistical , Pharmaceutical Preparations/chemistry , Proteins/drug effects , Pharmaceutical Preparations/metabolism , Phenotype , Proteins/metabolism
10.
Mol Inform ; 42(4): e2200216, 2023 04.
Article in English | MEDLINE | ID: mdl-36633361

ABSTRACT

Identification of novel chemotypes with biological activity similar to a known active molecule is an important challenge in drug discovery called 'scaffold hopping'. Small-, medium-, and large-step scaffold hopping efforts may lead to increasing degrees of chemical structure novelty with respect to the parent compound. In the present paper, we focus on the problem of large-step scaffold hopping. We assembled a high quality and well characterized dataset of scaffold hopping examples comprising pairs of active molecules and including a variety of protein targets. This dataset was used to build a benchmark corresponding to the setting of real-life applications: one active molecule is known, and the second active is searched among a set of decoys chosen in a way to avoid statistical bias. This allowed us to evaluate the performance of computational methods for solving large-step scaffold hopping problems. In particular, we assessed how difficult these problems are, particularly for classical 2D and 3D ligand-based methods. We also showed that a machine-learning chemogenomic algorithm outperforms classical methods and we provided some useful hints for future improvements.


Subject(s)
Benchmarking , Drug Discovery , Drug Discovery/methods , Ligands , Algorithms , Machine Learning
11.
Cancers (Basel) ; 15(19)2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37835564

ABSTRACT

A wide panel of microtubule-associated proteins and kinases is involved in coordinated regulation of the microtubule cytoskeleton and may thus represent valuable molecular markers contributing to major cellular pathways deregulated in cancer. We previously identified a panel of 17 microtubule-related (MT-Rel) genes that are differentially expressed in breast tumors showing resistance to taxane-based chemotherapy. In the present study, we evaluated the expression, prognostic value and functional impact of these genes in breast cancer. We show that 14 MT-Rel genes (KIF4A, ASPM, KIF20A, KIF14, TPX2, KIF18B, KIFC1, AURKB, KIF2C, GTSE1, KIF15, KIF11, RACGAP1, STMN1) are up-regulated in breast tumors compared with adjacent normal tissue. Six of them (KIF4A, ASPM, KIF20A, KIF14, TPX2, KIF18B) are overexpressed by more than 10-fold in tumor samples and four of them (KIF11, AURKB, TPX2 and KIFC1) are essential for cell survival. Overexpression of all 14 genes, and underexpression of 3 other MT-Rel genes (MAST4, MAPT and MTUS1) are associated with poor breast cancer patient survival. A Systems Biology approach highlighted three major functional networks connecting the 17 MT-Rel genes and their partners, which are centered on spindle assembly, chromosome segregation and cytokinesis. Our studies identified mitotic Aurora kinases and their substrates as major targets for therapeutic approaches against breast cancer.

12.
Pediatr Pulmonol ; 57(12): 2992-2999, 2022 12.
Article in English | MEDLINE | ID: mdl-35996214

ABSTRACT

INTRODUCTION: Clinical trials for CFTR modulators consider mean changes of clinical status at the cohort level, and thus fail to assess the heterogeneity of the response. We aimed to study the different response profiles to lumacaftor-ivacaftor according to age in children with cystic fibrosis (CF). METHODS: A mathematical framework, including principal component analysis, data clustering, and data completion, was applied to a multicenter cohort of 112 children aged 6-18 years, treated with lumacaftor-ivacaftor. Studied parameters at baseline and 6 months included body mass index (BMI), number of days of antibiotics (ATB), sweat test (ST), forced expiratory volume in 1 s expressed in percentage predicted (ppFEV1), forced vital capacity (ppFVC), and forced expiratory flow at 25%-75% of FVC (ppFEF25-75). RESULTS: Change in ppFEV1 was the most significant parameter in characterizing response heterogeneity among the 12-18-year-old patients. Patients with minimal changes in ppFEV1 were further separated by change in BMI and ATB course. In the 6-12-year-old children both BMI and ppFEV1 evolution were the most relevant. ST change was not associated with a clinical response. CONCLUSIONS: Change in ppFEV1, BMI, and ATB course are the most relevant outcomes to discriminate clinical response profiles in children treated with lumacaftor-ivacaftor. Prepubertal and pubertal children display different response profiles.


Subject(s)
Cystic Fibrosis Transmembrane Conductance Regulator , Cystic Fibrosis , Child , Humans , Adolescent , Cystic Fibrosis Transmembrane Conductance Regulator/therapeutic use , Aminophenols/therapeutic use , Aminophenols/pharmacology , Benzodioxoles/therapeutic use , Benzodioxoles/pharmacology , Aminopyridines/therapeutic use , Aminopyridines/pharmacology , Cystic Fibrosis/drug therapy , Cystic Fibrosis/genetics , Cystic Fibrosis/complications , Forced Expiratory Volume , Drug Combinations , Anti-Bacterial Agents/therapeutic use , Fibrosis , Mutation
13.
BMC Bioinformatics ; 12: 169, 2011 May 18.
Article in English | MEDLINE | ID: mdl-21586169

ABSTRACT

BACKGROUND: Drug side-effects, or adverse drug reactions, have become a major public health concern. They are one of the main causes of failure in the process of drug development, and of drug withdrawal once drugs have reached the market. Therefore, in silico prediction of potential side-effects early in the drug discovery process, before reaching the clinical stages, is of great interest to improve this long and expensive process and to provide new efficient and safe therapies for patients. RESULTS: In the present work, we propose a new method to predict potential side-effects of drug candidate molecules based on their chemical structures, applicable to large molecular databanks. A unique feature of the proposed method is its ability to extract correlated sets of chemical substructures (or chemical fragments) and side-effects. This is made possible using sparse canonical correlation analysis (SCCA). In the results, we show the usefulness of the proposed method by predicting 1385 side-effects in the SIDER database from the chemical structures of 888 approved drugs. These predictions are performed with simultaneous extraction of correlated ensembles formed by a set of chemical substructures shared by drugs that are likely to have a set of side-effects. We also conduct a comprehensive side-effect prediction for many uncharacterized drug molecules stored in DrugBank, and were able to confirm interesting predictions using independent sources of information. CONCLUSIONS: The proposed method is expected to be useful in various stages of the drug development process.


Subject(s)
Computational Biology/methods , Drug-Related Side Effects and Adverse Reactions , Computational Biology/economics , Databases, Factual , Drug Delivery Systems , Drug Design , Humans , Pharmaceutical Preparations/chemistry
14.
J Chem Inf Model ; 51(5): 1183-94, 2011 May 23.
Article in English | MEDLINE | ID: mdl-21506615

ABSTRACT

The identification of rules governing molecular recognition between drug chemical substructures and protein functional sites is a challenging issue at many stages of the drug development process. In this paper we develop a novel method to extract sets of drug chemical substructures and protein domains that govern drug-target interactions on a genome-wide scale. This is made possible using sparse canonical correspondence analysis (SCCA) for analyzing drug substructure profiles and protein domain profiles simultaneously. The method does not depend on the availability of protein 3D structures. From a data set of known drug-target interactions including enzymes, ion channels, G protein-coupled receptors, and nuclear receptors, we extract a set of chemical substructures shared by drugs able to bind to a set of protein domains. These two sets of extracted chemical substructures and protein domains form components that can be further exploited in a drug discovery process. This approach successfully clusters protein domains that may be evolutionarily unrelated but that bind a common set of chemical substructures. As shown in several examples, it can also be very helpful for predicting new protein-ligand interactions and addressing the problem of ligand specificity. The proposed method constitutes a contribution to the recent field of chemogenomics that aims to connect the chemical space with the biological space.


Subject(s)
Drug Design , Enzymes/chemistry , Ion Channels/chemistry , Receptors, Cytoplasmic and Nuclear/chemistry , Receptors, G-Protein-Coupled/chemistry , Algorithms , Binding Sites , Data Mining , Drug Discovery , Ligands , Protein Binding , Protein Interaction Domains and Motifs
15.
BMC Bioinformatics ; 11: 99, 2010 Feb 22.
Article in English | MEDLINE | ID: mdl-20175916

ABSTRACT

BACKGROUND: Predicting which molecules can bind to a given binding site of a protein with known 3D structure is important to decipher the protein function, and useful in drug design. A classical assumption in structural biology is that proteins with similar 3D structures have related molecular functions, and therefore may bind similar ligands. However, proteins that do not display any overall sequence or structure similarity may also bind similar ligands if they contain similar binding sites. Quantitatively assessing the similarity between binding sites may therefore be useful to propose new ligands for a given pocket, based on those known for similar pockets. RESULTS: We propose a new method to quantify the similarity between binding pockets, and explore its relevance for ligand prediction. We represent each pocket by a cloud of atoms, and assess the similarity between two pockets by aligning their atoms in the 3D space and comparing the resulting configurations with a convolution kernel. Pocket alignment and comparison is possible even when the corresponding proteins share no sequence or overall structure similarities. In order to predict ligands for a given target pocket, we compare it to an ensemble of pockets with known ligands to identify the most similar pockets. We discuss two criteria to evaluate the performance of a binding pocket similarity measure in the context of ligand prediction, namely, area under ROC curve (AUC scores) and classification based scores. We show that the latter is better suited to evaluate the methods with respect to ligand prediction, and demonstrate the relevance of our new binding site similarity compared to existing similarity measures. CONCLUSIONS: This study demonstrates the relevance of the proposed method to identify ligands binding to known binding pockets. We also provide a new benchmark for future work in this field. The new method and the benchmark are available at http://cbio.ensmp.fr/paris/.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Proteins/metabolism , Binding Sites , Databases, Protein , Ligands , Models, Molecular , Protein Conformation , Structure-Activity Relationship
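
The idea in the abstract of record 15 of comparing two pockets, each represented as a cloud of atoms, with a convolution kernel can be sketched as follows. This is only an illustration under stated assumptions: the pockets are taken to be already aligned in 3D, a Gaussian function of inter-atomic distance is assumed as the base kernel, and the names `pocket_kernel`, `normalized_similarity`, and `sigma` are hypothetical rather than the authors' definitions.

```python
import math


def pocket_kernel(pocket_a, pocket_b, sigma=1.0):
    """Sum of Gaussian similarities over all atom pairs of two aligned pockets.

    Each pocket is a list of (x, y, z) atom coordinates. The Gaussian base
    kernel and the bandwidth sigma are assumptions made for this sketch.
    """
    total = 0.0
    for atom_a in pocket_a:
        for atom_b in pocket_b:
            sq_dist = sum((u - v) ** 2 for u, v in zip(atom_a, atom_b))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total


def normalized_similarity(pocket_a, pocket_b, sigma=1.0):
    """Kernel value rescaled so that a pocket compared with itself scores 1."""
    k_ab = pocket_kernel(pocket_a, pocket_b, sigma)
    k_aa = pocket_kernel(pocket_a, pocket_a, sigma)
    k_bb = pocket_kernel(pocket_b, pocket_b, sigma)
    return k_ab / math.sqrt(k_aa * k_bb)


# Two toy pockets: the second is the first shifted by 0.5 along z.
pocket_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pocket_2 = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]
print(normalized_similarity(pocket_1, pocket_2))
```

To propose ligands for a query pocket along the lines the abstract describes, one would score it against a set of pockets with known ligands and inspect the highest-scoring neighbors. Atom types and the alignment step itself are left out of this sketch.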
16.
J Cheminform ; 12(1): 11, 2020 Feb 10.
Article in English | MEDLINE | ID: mdl-33431042

ABSTRACT

Chemogenomics, also called proteochemometrics, covers a range of computational methods that can be used to predict protein-ligand interactions at large scales in the protein and chemical spaces. They differ from more classical ligand-based methods (also called QSAR) that predict ligands for a given protein receptor. In the context of the drug discovery process, chemogenomics makes it possible to tackle the question of predicting off-target proteins for drug candidates, one of the main causes of undesirable side-effects and failure within drug development processes. The present study compares shallow and deep machine-learning approaches for chemogenomics, and explores data augmentation techniques for deep learning algorithms in chemogenomics. Shallow machine-learning algorithms rely on expert-based chemical and protein descriptors, while recent developments in deep learning algorithms make it possible to learn abstract numerical representations of molecular graphs and protein sequences, in order to optimise the performance of the prediction task. We first propose a formulation of chemogenomics with deep learning, called the chemogenomic neural network (CN), as a feed-forward neural network taking as input the combination of molecule and protein representations learnt by molecular graph and protein sequence encoders. We show that, on large datasets, the deep learning CN model outperforms state-of-the-art shallow methods, and competes with deep methods with expert-based descriptors. However, on small datasets, shallow methods present better prediction performance than deep learning methods. Then, we evaluate data augmentation techniques, namely multi-view and transfer learning, to improve the prediction performance of the chemogenomic neural network. We conclude that a promising research direction is to integrate heterogeneous sources of data such as auxiliary tasks for which large datasets are available, or independently, multiple molecule and protein attribute views.

17.
BMC Bioinformatics ; 9: 363, 2008 Sep 06.
Article in English | MEDLINE | ID: mdl-18775075

ABSTRACT

BACKGROUND: The G-protein coupled receptor (GPCR) superfamily is currently the largest class of therapeutic targets. In silico prediction of interactions between GPCRs and small molecules in the transmembrane ligand-binding site is therefore a crucial step in the drug discovery process, which remains a daunting task due to the difficulty of characterizing the 3D structure of most GPCRs, and to the limited number of known ligands for some members of the superfamily. Chemogenomics, which attempts to characterize interactions between all members of a target class and all small molecules simultaneously, has recently been proposed as an interesting alternative to traditional docking or ligand-based virtual screening strategies. RESULTS: We show that interaction prediction in the chemogenomics framework outperforms state-of-the-art individual ligand-based methods in accuracy, both for receptors with known ligands and for those without. This is done with no knowledge of the receptor 3D structure. In particular we are able to predict ligands of orphan GPCRs with an estimated accuracy of 78.1%. CONCLUSION: We propose new methods for in silico chemogenomics and validate them on the virtual screening of GPCRs. The methods represent an extension of a recently proposed machine learning strategy, based on support vector machines (SVM), which provides a flexible framework to incorporate various information sources on the biological space of targets and on the chemical space of small molecules. We investigate the use of 2D and 3D descriptors for small molecules, and test a variety of descriptors for GPCRs. We show that incorporating information about the known hierarchical classification of the target family and about key residues in their inferred binding pockets significantly improves the prediction accuracy of our model.


Subject(s)
Drug Delivery Systems/methods , Models, Chemical , Models, Molecular , Pharmaceutical Preparations/chemistry , Protein Interaction Mapping/methods , Receptors, G-Protein-Coupled/chemistry , Receptors, G-Protein-Coupled/ultrastructure , Binding Sites , Computer Simulation , Protein Binding
18.
J Mol Biol ; 366(3): 868-81, 2007 Feb 23.
Article in English | MEDLINE | ID: mdl-17196981

ABSTRACT

Enzymes from the pentose phosphate pathway (PPP) are potential targets for the development of new drugs against Trypanosoma brucei, the causative agent of African sleeping sickness: for instance, 6-phosphogluconate dehydrogenase is currently being actively studied for this purpose. Structural and functional studies are necessary to better characterize the associated enzymes and compare them to their human homologues, in order to undertake structure-based drug design studies on such targets. In this context, the crystal structure of 6-phosphogluconolactonase (6PGL) from T. brucei, the second enzyme of the PPP, was determined at 2.1 Å resolution. Comparison of its sequence and structure to other related proteins with a known structure, either in the 6PGL family (Thermotoga maritima Tm6PGL (1PBT) and Vibrio cholerae Vc6PGL (1Y89), which have not been discussed in print) or in the glucosamine-6-phosphate deaminase family (hexameric Escherichia coli 1DEA and monomeric Bacillus subtilis 2BKV), allowed the identification of the 6PGL active site. In addition to the analysis of the crystal structure, 3D NMR interaction studies and docking experiments are reported here. Key residues involved in substrate binding or in catalysis were identified.


Subject(s)
Carboxylic Ester Hydrolases/chemistry , Carboxylic Ester Hydrolases/metabolism , Protozoan Proteins/chemistry , Protozoan Proteins/metabolism , Trypanosoma brucei brucei/enzymology , Amino Acid Sequence , Animals , Binding Sites , Catalysis , Crystallography, X-Ray , Dimerization , Gluconates/chemistry , Models, Molecular , Molecular Sequence Data , Nuclear Magnetic Resonance, Biomolecular , Protein Structure, Secondary , Sequence Alignment , Structure-Activity Relationship , Substrate Specificity , Zinc/metabolism
19.
PLoS One ; 13(10): e0204999, 2018.
Article in English | MEDLINE | ID: mdl-30286165

ABSTRACT

Adverse drug reactions, also called side effects, range from mild to fatal clinical events and significantly affect the quality of care. Among other causes, side effects occur when drugs bind to proteins other than their intended target. As experimentally testing drug specificity against the entire proteome is out of reach, we investigate the application of chemogenomics approaches. We formulate the study of drug specificity as a problem of predicting interactions between drugs and proteins at the proteome scale. We build several benchmark datasets, and propose NN-MT, a multi-task Support Vector Machine (SVM) algorithm that is trained on a limited number of data points, in order to solve the computational issues of proteome-wide SVM for chemogenomics. We compare NN-MT to different state-of-the-art methods, and show that its prediction performance is similar or better, at a low computational cost. Compared to its competitors, the proposed method is particularly effective at predicting (protein, ligand) interactions in the difficult double-orphan case, i.e. when no interactions are previously known for either the protein or the ligand. The NN-MT algorithm appears to be a good default method, providing state-of-the-art or better performance in the wide range of prediction scenarios considered in the present study: proteome-wide prediction, protein family prediction, test (protein, ligand) pairs dissimilar to pairs in the training set, and orphan cases.


Subject(s)
Genomics , Pharmaceutical Preparations , Drug-Related Side Effects and Adverse Reactions/diagnosis , Pharmaceutical Preparations/metabolism , Prognosis , Support Vector Machine
20.
Mol Inform ; 36(10)2017 10.
Article in English | MEDLINE | ID: mdl-28949440

ABSTRACT

The development of high-throughput in vitro assays to study quantitatively the toxicity of chemical compounds on genetically characterized human-derived cell lines paves the way to predictive toxicogenetics, where one would be able to predict the toxicity of any particular compound on any particular individual. In this paper we present a machine learning-based approach for that purpose, kernel multitask regression (KMR), which combines chemical characterizations of molecular compounds with genetic and transcriptomic characterizations of cell lines to predict the toxicity of a given compound on a given cell line. We demonstrate the relevance of the method on the recent DREAM8 Toxicogenetics challenge, where it ranked among the best state-of-the-art models, and discuss the importance of choosing good descriptors for cell lines and chemicals.
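
The abstract gives no formulas for KMR. To make the general idea concrete, the sketch below fits a plain kernel ridge regression on (compound, cell line) pairs, with a pair kernel assumed to be the product of a compound kernel and a cell-line kernel. This is a generic stand-in, not the published algorithm: the linear kernels, the ridge parameter and every function name here are assumptions made for illustration.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def dot(u, v):
    """Linear kernel between two descriptor vectors."""
    return sum(x * y for x, y in zip(u, v))


def product_kernel(pair_a, pair_b):
    """Compound kernel times cell-line kernel (both linear here)."""
    (chem_a, cell_a), (chem_b, cell_b) = pair_a, pair_b
    return dot(chem_a, chem_b) * dot(cell_a, cell_b)


def fit(pairs, toxicities, ridge=1.0):
    """Return dual coefficients alpha = (K + ridge * I)^-1 y."""
    n = len(pairs)
    gram = [[product_kernel(pairs[i], pairs[j]) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    return solve(gram, toxicities)


def predict(pairs, alphas, new_pair):
    """Predicted toxicity of a new (compound, cell line) pair."""
    return sum(a * product_kernel(p, new_pair) for a, p in zip(alphas, pairs))


# Toy usage with made-up one-dimensional descriptors.
train_pairs = [([1.0], [1.0]), ([2.0], [1.0])]
alphas = fit(train_pairs, [2.0, 5.0], ridge=1.0)
estimate = predict(train_pairs, alphas, ([1.0], [1.0]))
```

With this formulation, swapping in better compound or cell-line descriptors only changes the two component kernels, which is one way to read the abstract's remark on the importance of descriptor choice.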


Subject(s)
Toxicogenetics/methods , Algorithms , Animals , Humans , Machine Learning , Regression Analysis , Toxicity Tests