ABSTRACT
RepurposeDrugs (https://repurposedrugs.org/) is a comprehensive web portal that combines a unique drug indication database with a machine learning (ML) predictor to discover new drug-indication associations for approved as well as investigational mono and combination therapies. The platform provides detailed information on treatment status, disease indications and clinical trials across 25 indication categories, including neoplasms and cardiovascular conditions. The current version comprises 4314 compounds (approved, terminated or investigational) and 161 drug combinations linked to 1756 indications/conditions, totaling 28 148 drug-disease pairs. By leveraging data on both approved and failed indications, RepurposeDrugs provides ML-based predictions for the approval potential of new drug-disease indications, both for mono- and combinatorial therapies, demonstrating high predictive accuracy in cross-validation. The ML predictor is validated through a number of real-world case studies, demonstrating its power to identify repurposing candidates with a high likelihood of future approval. To our knowledge, the RepurposeDrugs web portal is the first integrative database and ML-based predictor for interactive exploration and prediction of both single-drug and combination approval likelihood across indications. Given its broad coverage of indication areas and therapeutic options, we expect it to accelerate many future drug repurposing projects.
Subjects
Drug Repositioning; Machine Learning; Drug Repositioning/methods; Humans; Internet; Drug Therapy, Combination; Databases, Pharmaceutical; Databases, Factual
ABSTRACT
MOTIVATION: Drug-target interactions (DTIs) hold a pivotal role in drug repurposing and elucidation of drug mechanisms of action. While single-targeted drugs have demonstrated clinical success, they often exhibit limited efficacy against complex diseases, such as cancers, whose development and treatment depend on several biological processes. Therefore, a comprehensive understanding of primary, secondary and even inactive targets becomes essential in the quest for effective and safe treatments for cancer and other indications. The human proteome offers over a thousand druggable targets, yet most FDA-approved drugs bind to only a small fraction of these targets. RESULTS: This study introduces an attention-based method (called MMAtt-DTA) to predict drug-target bioactivities across human proteins within seven superfamilies. We examined nine different descriptor sets to identify optimal signature descriptors for predicting novel DTIs. Our testing results demonstrated Spearman correlations exceeding 0.72 (P < 0.001) for six out of seven superfamilies. The proposed method outperformed fourteen state-of-the-art machine learning, deep learning and graph-based methods and maintained relatively high performance for most target superfamilies when tested with independent bioactivity data sources. We computationally validated 185 676 drug-target pairs from ChEMBL-V33 that were not available during model training, achieving a reasonable performance with Spearman correlation >0.57 (P < 0.001) for most superfamilies. This underscores the robustness of the proposed method for predicting novel DTIs. Finally, we applied our method to predict missing bioactivities among 3492 approved molecules in ChEMBL-V33, offering a valuable tool for advancing drug mechanism discovery and repurposing existing drugs for new indications. AVAILABILITY AND IMPLEMENTATION: https://github.com/AronSchulman/MMAtt-DTA.
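The Spearman correlation reported above measures how well predicted bioactivities preserve the rank order of the measured ones. The following is a minimal, standard-library sketch of that metric, not the authors' evaluation code; the function names and the example values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical measured vs. predicted bioactivities (illustrative values only).
measured = [5.1, 6.3, 7.8, 4.9, 8.2]
predicted = [5.4, 6.0, 7.1, 5.6, 8.0]
rho = spearman(measured, predicted)
print(f"Spearman rho = {rho:.2f}")
```

Because the metric depends only on ranks, it rewards a model that orders drug-target pairs correctly even when the predicted values are offset from the measured ones.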
Subjects
Drug Repositioning; Humans; Drug Repositioning/methods; Proteins/metabolism; Proteins/chemistry; Machine Learning; Computational Biology/methods; Drug Discovery/methods
ABSTRACT
MOTIVATION: Peptide therapeutics hinge on the precise interaction between a tailored peptide and its designated receptor, while mitigating interactions with alternate receptors is equally indispensable. Existing methods primarily estimate the binding score between protein and peptide pairs. However, for a specific peptide without a corresponding protein, it is challenging to identify the proteins it could bind due to the sheer number of potential candidates. RESULTS: We propose a transformer-based protein embedding scheme in this study that can quickly identify and rank millions of interacting proteins. Furthermore, the proposed approach outperforms existing sequence- and structure-based methods, with a mean AUC-ROC and AUC-PR of 0.73. AVAILABILITY AND IMPLEMENTATION: Training data, scripts, and fine-tuned parameters are available at https://github.com/RoniGurvich/Peptriever. The proposed method is linked with a web application available for customized prediction at https://peptriever.app/.
Subjects
Peptides; Protein Binding; Proteins; Software; Peptides/chemistry; Peptides/metabolism; Proteins/chemistry; Proteins/metabolism; Algorithms; Computational Biology/methods; Databases, Protein
ABSTRACT
Functional precision medicine (fPM) offers an exciting, simplified approach to finding the right applications for existing molecules and enhancing therapeutic potential. Integrative and robust tools ensuring high accuracy and reliability of the results are critical. In response to this need, we previously developed Breeze, a drug screening data analysis pipeline, designed to facilitate quality control, dose-response curve fitting, and data visualization in a user-friendly manner. Here, we describe the latest version of Breeze (release 2.0), which implements an array of advanced data exploration capabilities, providing users with comprehensive post-analysis and interactive visualization options that are essential for minimizing false positive/negative outcomes and ensuring accurate interpretation of drug sensitivity and resistance data. The Breeze 2.0 web-tool also enables integrative analysis and cross-comparison of user-uploaded data with publicly available drug response datasets. The updated version incorporates new drug quantification metrics, supports analysis of both multi-dose and single-dose drug screening data and introduces a redesigned, intuitive user interface. With these enhancements, Breeze 2.0 is anticipated to substantially broaden its potential applications in diverse domains of fPM.
Subjects
Drug Evaluation, Preclinical; Software; Computer Graphics; Reproducibility of Results; User-Computer Interface; Internet
ABSTRACT
Chemosensitivity assays are commonly used for preclinical drug discovery and clinical trial optimization. However, data from independent assays are often discordant, largely attributed to uncharacterized variation in the experimental materials and protocols. We report here the launching of Minimal Information for Chemosensitivity Assays (MICHA), accessed via https://micha-protocol.org. Distinguished from existing efforts that are often lacking support from data integration tools, MICHA can automatically extract publicly available information to facilitate the assay annotation including: 1) compounds, 2) samples, 3) reagents and 4) data processing methods. For example, MICHA provides an integrative web server and database to obtain compound annotation including chemical structures, targets and disease indications. In addition, the annotation of cell line samples, assay protocols and literature references can be greatly eased by retrieving manually curated catalogues. Once the annotation is complete, MICHA can export a report that conforms to the FAIR principle (Findable, Accessible, Interoperable and Reusable) of drug screening studies. To consolidate the utility of MICHA, we provide FAIRified protocols from five major cancer drug screening studies as well as six recently conducted COVID-19 studies. With the MICHA web server and database, we envisage a wider adoption of a community-driven effort to improve the open access of drug sensitivity assays.
ABSTRACT
Drug development involves a deep understanding of the mechanisms of action and possible side effects of each drug, and sometimes results in the identification of new and unexpected uses for drugs, termed drug repurposing. For both serendipitous observations and systematic mechanistic explorations, confirmation of new indications for a drug requires hypothesis building around relevant drug-related data, such as the molecular targets involved and patient and cellular responses. These datasets are available in public repositories, but apart from the computational bottleneck imposed by sifting through the sheer amount of data, a major challenge is the difficulty of selecting which databases to use from an increasingly large number of available databases. The database selection is made harder by the lack of an overview of the types of data offered in each database. In order to alleviate these problems and to guide the end user through drug repurposing efforts, we provide here a survey of 102 of the most promising and drug-relevant databases reported to date. We summarize the target coverage and types of data available in each database and provide several examples of how multi-database exploration can facilitate drug repurposing.
Subjects
Databases, Factual; Drug Repositioning; Computational Biology/methods; Drug Delivery Systems; Surveys and Questionnaires
ABSTRACT
BACKGROUND: Drug-target interactions (DTIs) are critical for drug repurposing and elucidation of drug mechanisms, and are manually curated by large databases, such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the number of curated articles likely constitutes only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied Bidirectional Encoder Representations from Transformers (BERT) to identify such articles. Because DTI data depend closely on the type of assay used to generate them, we also aimed to incorporate functions to predict the assay format. RESULTS: Our novel method identified 0.6 million articles (along with drug and protein information) that were not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~99% accuracy for identifying articles containing quantitative drug-target profiles. The micro-averaged F1 score for the prediction of assay format is 88%, which leaves room for improvement in future studies. CONCLUSION: The BERT model in this study is robust and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advancement in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.
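The micro-averaged F1 score quoted for assay-format prediction pools true positives, false positives and false negatives over all classes before computing F1. A minimal sketch of that calculation follows; it is not taken from the authors' pipeline, and the assay-format labels and predictions are hypothetical placeholders.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over every class, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label and truth != label:
                fp += 1
            elif pred != label and truth == label:
                fn += 1
    denominator = 2 * tp + fp + fn
    # With no samples at all the score is undefined; return 0.0 by convention.
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical assay-format labels for five articles (illustrative only).
true_formats = ["biochemical", "cell_based", "cell_based", "organism_based", "biochemical"]
pred_formats = ["biochemical", "cell_based", "biochemical", "organism_based", "biochemical"]
score = micro_f1(true_formats, pred_formats)
print(f"micro-F1 = {score:.2f}")
```

For single-label classification such as this, the micro-averaged F1 equals plain accuracy, since each misclassified sample counts once as a false positive and once as a false negative.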
Subjects
Drug Repositioning; Proteins; Databases, Factual; Drug Interactions; Proteins/metabolism; PubMed
ABSTRACT
Knowledge of the full target space of drugs (or drug-like compounds) provides important insights into the potential therapeutic use of the agents to modulate or avoid their various on- and off-targets in drug discovery and precision medicine. However, there is a lack of consolidated databases and associated data exploration tools that allow for systematic profiling of drug target-binding potencies of both approved and investigational agents using a network-centric approach. We recently initiated a community-driven platform, Drug Target Commons (DTC), which is an open-data crowdsourcing platform designed to improve the management, reproducibility and extended use of compound-target bioactivity data for drug discovery and repurposing, as well as target identification applications. In this work, we demonstrate an integrated use of the rich bioactivity data from DTC and related drug databases using Drug Target Profiler (DTP), an open-source software and web tool for interactive exploration of drug-target interaction networks. DTP was designed for network-centric modeling of mode-of-action of multi-targeting anticancer compounds, especially for precision oncology applications. DTP enables users to construct an interaction network based on integrated bioactivity data across selected chemical compounds and their protein targets, further customizable using various visualization and filtering options, as well as cross-links to several drug and protein databases to provide comprehensive information of the network nodes and interactions. We demonstrate here the operation of the DTP tool and its unique features by several use cases related to both drug discovery and drug repurposing applications, using examples of anticancer drugs with shared target profiles. DTP is freely accessible at http://drugtargetprofiler.fimm.fi/.
ABSTRACT
Molecular and functional profiling of cancer cell lines is subject to laboratory-specific experimental practices and data analysis protocols. The current challenge therefore is how to make an integrated use of the omics profiles of cancer cell lines for reliable biological discoveries. Here, we carried out a systematic analysis of nine types of data modalities using meta-analysis of 53 omics studies across 12 research laboratories for 2,018 cell lines. To account for a relatively low consistency observed for certain data modalities, we developed a robust data integration approach that identifies reproducible signals shared among multiple data modalities and studies. We demonstrated the power of the integrative analyses by identifying a novel driver gene, ECHDC1, with tumor suppressive role validated both in breast cancer cells and patient tumors. The multi-modal meta-analysis approach also identified synthetic lethal partners of cancer drivers, including a co-dependency of PTEN deficient endometrial cancer cells on RNA helicases.
Subjects
Genes, Tumor Suppressor; Genomics; Algorithms; Breast Neoplasms/genetics; Cell Line, Tumor; Databases, Genetic; Epistasis, Genetic; Female; Humans; Mass Spectrometry; Reproducibility of Results; Synthetic Lethal Mutations
ABSTRACT
Celecoxib, or Celebrex, a nonsteroidal anti-inflammatory drug, is one of the most common medicines for treating inflammatory diseases. Recently, it has been shown that celecoxib has implications in complex diseases, such as Alzheimer disease and cancer, as well as in cardiovascular risk assessment and toxicity, suggesting that celecoxib may affect multiple unknown targets. In this project, we detected targets of celecoxib within the nervous system using a label-free thermal proteome profiling method. First, proteins of the rat hippocampus were treated with multiple drug concentrations and temperatures. Next, we separated the soluble proteins from the denatured and sedimented total protein load by ultracentrifugation. Subsequently, the soluble proteins were analyzed by nano-liquid chromatography tandem mass spectrometry to identify the celecoxib-targeted proteins based on structural changes, detected as a shift in the thermal stability of targeted proteins toward higher solubility at higher temperatures. In the analysis of the soluble protein extract at 67°C, 44 proteins were uniquely detected in drug-treated samples out of all 478 proteins identified at this temperature. Ras-associated binding protein 4a, one of these 44 proteins, has previously been reported as one of the celecoxib off-targets in the rat central nervous system. Furthermore, we provide more molecular details through biomedical enrichment analysis to explore the potential role of all detected proteins in biological systems. We show that the detected proteins play a role in signaling pathways related to neurodegenerative disease and cancer. Finally, we provide supporting molecular evidence for celecoxib repurposing by exploring its drug targets. SIGNIFICANCE STATEMENT: This study determined 44 off-target proteins of celecoxib, a nonsteroidal anti-inflammatory drug and one of the most common medicines for treating inflammatory diseases. It shows that these proteins play a role in signaling pathways related to neurodegenerative disease and cancer. Finally, the study provides supporting molecular evidence for celecoxib repurposing by exploring its drug targets.
Subjects
Anti-Inflammatory Agents, Non-Steroidal/pharmacology; Celecoxib/pharmacology; Hippocampus/drug effects; Hippocampus/metabolism; Proteins/metabolism; Proteome/metabolism; Animals; Chromatography, Liquid/methods; Gene Expression Profiling/methods; Humans; Male; Neurodegenerative Diseases/drug therapy; Neurodegenerative Diseases/metabolism; Rats; Solubility/drug effects; Tandem Mass Spectrometry/methods; Temperature
ABSTRACT
Drug combination therapy has the potential to enhance efficacy, reduce dose-dependent toxicity and prevent the emergence of drug resistance. However, discovery of synergistic and effective drug combinations has been a laborious and often serendipitous process. In recent years, identification of combination therapies has been accelerated by advances in high-throughput drug screening, but informatics approaches for systems-level data management and analysis are needed. To contribute toward this goal, we created an open-access data portal called DrugComb (https://drugcomb.fimm.fi) where the results of drug combination screening studies are accumulated, standardized and harmonized. Through the data portal, we provided a web server to analyze and visualize users' own drug combination screening data. Users can also participate in a crowdsourced data curation effort by depositing their data at DrugComb. To initiate the data repository, we collected 437 932 drug combinations tested on a variety of cancer cell lines. We showed that linear regression approaches, when considering chemical fingerprints as predictors, have the potential to achieve high accuracy in predicting the sensitivity of drug combinations. All the data and informatics tools are freely available in DrugComb to enable a more efficient utilization of data resources for future drug combination discovery.
Subjects
Antineoplastic Agents/therapeutic use; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Drug Synergism; Neoplasms/drug therapy; Computational Biology; Drug Discovery; Drug Evaluation, Preclinical; Humans
ABSTRACT
BACKGROUND: Dispersed biomedical databases limit user exploration to generate structured knowledge. Linked Data unifies data structures and makes the dispersed data easy to search across resources, but it does not support the human cognition needed to achieve insights. In addition, potential errors in the data are difficult to detect in their free formats. Devising a visualization that synthesizes multiple sources in such a way that links between data sources are transparent, and uncertainties, such as data conflicts, are salient is challenging. RESULTS: To investigate the requirements and challenges of uncertainty-aware visualizations of linked data, we developed MediSyn, a system that synthesizes medical datasets to support drug treatment selection. It uses a matrix-based layout to visually link drugs, targets (e.g., mutations), and tumor types. Data uncertainties are salient in MediSyn; for example, (i) missing data are exposed in the matrix view of drug-target relations; (ii) inconsistencies between datasets are shown via overlaid layers; and (iii) data credibility is conveyed through links to data provenance. CONCLUSIONS: Through the synthesis of two manually curated datasets, cancer treatment biomarkers and drug-target bioactivities, a use case shows how MediSyn effectively supports the discovery of drug-repurposing opportunities. A study with six domain experts indicated that MediSyn benefited drug selection and the discovery of data inconsistencies. Though linked publication sources supported user exploration for further information, the causes of inconsistencies were not easy to find. Additionally, MediSyn could incorporate more patient data to increase its informativeness. We derive design implications from the findings.
Subjects
Databases, Factual; Drug Therapy; Software; Uncertainty; Adult; Female; Humans; Surveys and Questionnaires
ABSTRACT
Motivation: Drug-target interactions (DTIs) play a pivotal role in drug discovery, which aims to identify potential drug targets and elucidate their mechanisms of action. In recent years, the application of natural language processing (NLP), particularly when combined with pre-trained language models, has gained considerable momentum in the biomedical domain, with the potential to mine vast amounts of text to facilitate the efficient extraction of DTIs from the literature. Results: In this article, we approach DTI extraction as an entity-relationship extraction problem, utilizing different pre-trained transformer language models, such as BERT, to extract DTIs. Our results indicate that an ensemble approach, combining gene descriptions from the Entrez Gene database with chemical descriptions from the Comparative Toxicogenomics Database (CTD), is critical for achieving optimal performance. The proposed model achieves an F1 score of 80.6 on the hidden DrugProt test set, which is the top-ranked performance among all the submitted models in the official evaluation. Furthermore, we conduct a comparative analysis to evaluate the effectiveness of various gene textual descriptions sourced from the Entrez Gene and UniProt databases to gain insights into their impact on performance. Our findings highlight the potential of NLP-based text mining using gene and chemical descriptions to improve drug-target extraction tasks. Availability and implementation: Datasets utilized in this study are accessible at https://dtis.drugtargetcommons.org/.
ABSTRACT
The drug development process takes 9-12 years and costs approximately one billion US dollars. Because of the high financial and time costs of the traditional drug discovery paradigm, repurposing old drugs to treat cancer and rare diseases is becoming popular. Computational approaches are mainly data-driven and involve a systematic analysis of different data types leading to the formulation of repurposing hypotheses. This study presents a novel scoring algorithm based on chemical and genomic data to repurpose drugs for 669 diseases from 22 groups, including various cancers and musculoskeletal, infectious, cardiovascular, and skin diseases. The data types used to design the scoring algorithm are chemical structures, drug-target interactions (DTI), pathways, and disease-gene associations. The scoring algorithm is strengthened by integrating the most comprehensive manually curated datasets for each data type. At a DrugRepo score ≥ 0.4, we repurposed 516 approved drugs across 545 diseases. Moreover, hundreds of novel predicted compounds can be matched with ongoing clinical trials. Our analysis is supported by a web tool available at: http://drugrepo.org/.
Subjects
Genomics
ABSTRACT
Combinatorial therapies have been recently proposed to improve the efficacy of anticancer treatment. The SynergyFinder R package is software used to analyze pre-clinical drug combination datasets. Here, we report the major updates to the SynergyFinder R package for improved interpretation and annotation of drug combination screening results. Unlike the existing implementations, the updated SynergyFinder R package includes five main innovations. 1) We extend the mathematical models to higher-order drug combination data analysis and implement dimension reduction techniques for visualizing the synergy landscape. 2) We provide a statistical analysis of drug combination synergy and sensitivity with confidence intervals and P values. 3) We incorporate a synergy barometer to harmonize multiple synergy scoring methods to provide a consensus metric for synergy. 4) We evaluate drug combination synergy and sensitivity to provide an unbiased interpretation of the clinical potential. 5) We enable fast annotation of drugs and cell lines, including their chemical and target information. These annotations will improve the interpretation of the mechanisms of action of drug combinations. To facilitate the use of the R package within the drug discovery community, we also provide a web server at www.synergyfinderplus.org as a user-friendly interface to enable a more flexible and versatile analysis of drug combination data.
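To make the idea of a synergy score concrete, the sketch below computes the excess over the Bliss independence reference model for a single dose pair. The abstract does not name the scoring methods it harmonizes, so using Bliss here is an assumption made for illustration; the code is Python rather than the package's R, and the function names and inhibition values are hypothetical.

```python
def bliss_expected(inhibition_a, inhibition_b):
    """Expected combined fractional inhibition if the two drugs act independently."""
    return inhibition_a + inhibition_b - inhibition_a * inhibition_b


def bliss_excess(observed, inhibition_a, inhibition_b):
    """Observed minus expected inhibition; positive values suggest synergy."""
    return observed - bliss_expected(inhibition_a, inhibition_b)


# Hypothetical fractional inhibitions (0 to 1) at one dose pair.
drug_a_alone = 0.30
drug_b_alone = 0.40
combination = 0.75

excess = bliss_excess(combination, drug_a_alone, drug_b_alone)
print(f"Bliss expected = {bliss_expected(drug_a_alone, drug_b_alone):.2f}")
print(f"Bliss excess   = {excess:.2f}")
```

A full analysis would evaluate such a score across the whole dose-response matrix and, as the abstract describes, compare it with other reference models before drawing conclusions.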
Subjects
Models, Theoretical; Software; Drug Synergism; Drug Combinations; Cell Line
ABSTRACT
The Columbia Cancer Target Discovery and Development (CTD2) Center is developing PANACEA, a resource comprising dose-responses and RNA sequencing (RNA-seq) profiles of 25 cell lines perturbed with ~400 clinical oncology drugs, to study a tumor-specific drug mechanism of action. Here, this resource serves as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions. Dose-response and perturbational profiles for 32 kinase inhibitors are provided to 21 teams who are blind to the identity of the compounds. The teams are asked to predict high-affinity binding targets of each compound among ~1,300 targets cataloged in DrugBank. The best performing methods leverage gene expression profile similarity analysis as well as deep-learning methodologies trained on individual datasets. This study lays the foundation for future integrative analyses of pharmacogenomic data, reconciliation of polypharmacology effects in different tumor contexts, and insights into network-based assessments of drug mechanisms of action.
Subjects
Neoplasms/drug therapy; Polypharmacology; Algorithms; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Neural Networks, Computer; Protein Kinases/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; Transcription, Genetic
ABSTRACT
Introduction: Drug repurposing provides a cost-effective strategy to re-use approved drugs for new medical indications. Several machine learning (ML) and artificial intelligence (AI) approaches have been developed for systematic identification of drug repurposing leads based on big data resources, hence further accelerating and de-risking the drug development process by computational means. Areas covered: The authors focus on supervised ML and AI methods that make use of publicly available databases and information resources. While most of the example applications are in the field of anticancer drug therapies, the methods and resources reviewed are widely applicable also to other indications including COVID-19 treatment. A particular emphasis is placed on the use of comprehensive target activity profiles that enable a systematic repurposing process by extending the target profile of drugs to include potent off-targets with therapeutic potential for a new indication. Expert opinion: The scarcity of clinical patient data and the current focus on genetic aberrations as primary drug targets may limit the performance of anticancer drug repurposing approaches that rely solely on genomics-based information. Functional testing of cancer patient cells exposed to a large number of targeted therapies and their combinations provides an additional source of repurposing information for tissue-aware AI approaches.
Subjects
Artificial Intelligence; Drug Repositioning/methods; Neoplasms/drug therapy; Antineoplastic Agents/pharmacology; Big Data; Cost-Benefit Analysis; Drug Development/economics; Drug Development/methods; Drug Repositioning/economics; Genomics/methods; Humans; Machine Learning; Neoplasms/genetics; COVID-19 Drug Treatment
ABSTRACT
Chemosensitivity assays are commonly used for preclinical drug discovery and clinical trial optimization. However, data from independent assays are often discordant, largely attributed to uncharacterized variation in the experimental materials and protocols. We report here the launching of MICHA (Minimal Information for Chemosensitivity Assays), accessed via https://micha-protocol.org. Distinguished from existing efforts that are often lacking support from data integration tools, MICHA can automatically extract publicly available information to facilitate the assay annotation including: 1) compounds, 2) samples, 3) reagents, and 4) data processing methods. For example, MICHA provides an integrative web server and database to obtain compound annotation including chemical structures, targets, and disease indications. In addition, the annotation of cell line samples, assay protocols and literature references can be greatly eased by retrieving manually curated catalogues. Once the annotation is complete, MICHA can export a report that conforms to the FAIR principle (Findable, Accessible, Interoperable and Reusable) of drug screening studies. To consolidate the utility of MICHA, we provide FAIRified protocols from five major cancer drug screening studies, as well as six recently conducted COVID-19 studies. With the MICHA webserver and database, we envisage a wider adoption of a community-driven effort to improve the open access of drug sensitivity assays.
ABSTRACT
Despite decades of intensive search for compounds that modulate the activity of particular protein targets, a large proportion of the human kinome remains as yet undrugged. Effective approaches are therefore required to map the massive space of unexplored compound-kinase interactions for novel and potent activities. Here, we carry out a crowdsourced benchmarking of predictive algorithms for kinase inhibitor potencies across multiple kinase families tested on unpublished bioactivity data. We find the top-performing predictions are based on various models, including kernel learning, gradient boosting and deep learning, and their ensemble leads to a predictive accuracy exceeding that of single-dose kinase activity assays. We design experiments based on the model predictions and identify unexpected activities even for under-studied kinases, thereby accelerating experimental mapping efforts. The open-source prediction algorithms together with the bioactivities between 95 compounds and 295 kinases provide a resource for benchmarking prediction algorithms and for extending the druggable kinome.
Subjects
Protein Kinase Inhibitors/pharmacology; Protein Kinases/metabolism; Algorithms; Benchmarking; Crowdsourcing; Databases, Pharmaceutical; Deep Learning; Drug Discovery; Drug Evaluation, Preclinical; Humans; Kinetics; Machine Learning; Models, Biological; Models, Chemical; Protein Kinase Inhibitors/chemistry; Protein Kinase Inhibitors/pharmacokinetics; Protein Kinases/chemistry; Proteomics; Regression Analysis
ABSTRACT
We conduct a cartography of rhodopsin-like non-olfactory G protein-coupled receptors (GPCRs) in the Ensembl database. The most recent genomic data (releases 90-92, 90 vertebrate genomes) are analyzed through the online interface, and receptors are mapped on phylogenetic guide trees that were constructed based on a set of ~14,000 amino acid sequences. This snapshot of genomic data suggests that vertebrate genomes harbour 142 clades of GPCRs without human orthologues. Among those, 69 have not, to our knowledge, been mentioned or studied previously in the literature, of which 28 are distant from existing receptors and likely new orphans. These newly identified receptors are candidates for more focused evolutionary studies, such as chromosomal mapping, as well as for in-depth pharmacological characterization. Interestingly, we also show that 37 of the 72 human orphan (or recently deorphanized) receptors included in this study cluster into nineteen closely related groups, which implies that there are fewer ligands to be identified than previously anticipated. Altogether, this work has significant implications when discussing nomenclature issues for GPCRs.