Search | BVS IEC

1.

Pharos 2023: an integrated resource for the understudied human proteome.

Kelleher, Keith J; Sheils, Timothy K; Mathias, Stephen L; Yang, Jeremy J; Metzger, Vincent T; Siramshetty, Vishal B; Nguyen, Dac-Trung; Jensen, Lars Juhl; Vidovic, Dusica; Schürer, Stephan C; Holmes, Jayme; Sharma, Karlie R; Pillai, Ajay; Bologa, Cristian G; Edwards, Jeremy S; Mathé, Ewy A; Oprea, Tudor I.

Nucleic Acids Res ; 51(D1): D1405-D1416, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36624666

ABSTRACT

The Illuminating the Druggable Genome (IDG) project aims to improve our understanding of understudied proteins and our ability to study them in the context of disease biology by perturbing them with small molecules, biologics, or other therapeutic modalities. Two main products from the IDG effort are the Target Central Resource Database (TCRD) (http://juniper.health.unm.edu/tcrd/), which curates and aggregates information, and Pharos (https://pharos.nih.gov/), a web interface for users to extract and visualize data from TCRD. Since the 2021 release, TCRD/Pharos has focused on developing visualization and analysis tools that help reveal higher-level patterns in the underlying data. The current iterations of TCRD and Pharos enable users to perform enrichment calculations based on subsets of targets, diseases, or ligands and to create interactive heat maps and UpSet charts of many types of annotations. Using several examples, we show how to address disease biology and drug discovery questions through enrichment calculations and UpSet charts.

Subjects

Databases, Factual , Molecular Targeted Therapy , Proteome , Humans , Biological Products , Drug Discovery , Internet , Proteome/drug effects
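The enrichment calculations mentioned in the abstract above can be illustrated with a short sketch. The abstract does not say which statistic Pharos uses, so the one-sided hypergeometric test below is an assumption chosen because it is a common way to score over-representation of an annotation in a subset. The function name and the numbers in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, subset: int, hits: int) -> float:
    """One-sided hypergeometric p-value, P(X >= hits).

    population: total number of targets considered (assumed background)
    annotated:  targets in the background that carry the annotation
    subset:     size of the user's target list
    hits:       targets in the subset that carry the annotation
    """
    if not (0 <= annotated <= population and 0 <= subset <= population):
        raise ValueError("annotated and subset must lie between 0 and population")
    lower = max(hits, 0)
    upper = min(annotated, subset)
    total = comb(population, subset)
    # Sum the probability mass of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, subset - k)
        for k in range(lower, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 targets, 400 annotated with a disease,
    # a user list of 50 targets of which 8 carry the annotation.
    p = hypergeom_enrichment_p(20_000, 400, 50, 8)
    print(f"enrichment p-value: {p:.3g}")
```

In practice a tool that tests many annotations at once would also correct for multiple testing; that step is omitted here.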

2.

RaMP-DB 2.0: a renovated knowledgebase for deriving biological and chemical insight from metabolites, proteins, and genes.

Braisted, John; Patt, Andrew; Tindall, Cole; Sheils, Timothy; Neyra, Jorge; Spencer, Kyle; Eicher, Tara; Mathé, Ewy A.

Bioinformatics ; 39(1)2023 01 01.

Article in English | MEDLINE | ID: mdl-36373969

ABSTRACT

MOTIVATION: Functional interpretation of high-throughput metabolomic and transcriptomic results is a crucial step in generating insight from experimental data. However, pathway and functional information for genes and metabolites is distributed among many siloed resources, limiting the scope of analyses that rely on a single knowledge source. RESULTS: RaMP-DB 2.0 is a web interface, relational database, API and R package designed for straightforward and comprehensive functional interpretation of metabolomic and multi-omic data. RaMP-DB 2.0 has been upgraded with an expanded breadth and depth of functional and chemical annotations (ClassyFire, LIPID MAPS, SMILES, InChIs, etc.), with new data types related to metabolites and lipids incorporated. To streamline entity resolution across multiple source databases, we have implemented a new semi-automated process, thereby lessening the burden of harmonization and supporting more frequent updates. The associated RaMP-DB 2.0 R package now supports queries on pathways, common reactions (e.g. metabolite-enzyme relationship), chemical functional ontologies, chemical classes and chemical structures, as well as enrichment analyses on pathways (multi-omic) and chemical classes. Lastly, the RaMP-DB web interface has been completely redesigned using the Angular framework. AVAILABILITY AND IMPLEMENTATION: The code used to build all components of RaMP-DB 2.0 is freely available on GitHub at https://github.com/ncats/ramp-db, https://github.com/ncats/RaMP-Client/ and https://github.com/ncats/RaMP-Backend. The RaMP-DB web application can be accessed at https://rampdb.nih.gov/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Metabolomics , Software , Databases, Factual , Gene Expression Profiling , Knowledge Bases , Proteins

3.

NCATS Inxight Drugs: a comprehensive and curated portal for translational research.

Siramshetty, Vishal B; Grishagin, Ivan; Nguyen, Ðac-Trung; Peryea, Tyler; Skovpen, Yulia; Stroganov, Oleg; Katzel, Daniel; Sheils, Timothy; Jadhav, Ajit; Mathé, Ewy A; Southall, Noel T.

Nucleic Acids Res ; 50(D1): D1307-D1316, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34648031

ABSTRACT

The United States has a complex regulatory scheme for marketing drugs. Understanding drug regulatory status is a daunting task that requires integrating data from many sources from the United States Food and Drug Administration (FDA), US government publications, and other processes related to drug development. At NCATS, we created Inxight Drugs (https://drugs.ncats.io), a web resource that attempts to address this challenge in a systematic manner. NCATS Inxight Drugs incorporates and unifies a wealth of data, including those supplied by the FDA and from independent public sources. The database offers a substantial amount of manually curated literature data unavailable from other sources. Currently, the database contains 125 036 product ingredients, including 2566 US approved drugs, 6242 marketed drugs, and 9684 investigational drugs. All substances are rigorously defined according to the ISO 11238 standard to comply with existing regulatory standards for unique drug substance identification. A special emphasis was placed on capturing manually curated and referenced data on treatment modalities and semantic relationships between substances. A supplementary resource 'Novel FDA Drug Approvals' features regulatory details of newly approved FDA drugs. The database is regularly updated using the NCATS Stitcher data integration tool, which automates data aggregation, and supports full data access through a RESTful API.

Subjects

Databases, Factual , Databases, Pharmaceutical , Pharmaceutical Preparations/classification , United States Food and Drug Administration , Humans , National Center for Advancing Translational Sciences (U.S.) , Translational Research, Biomedical/classification , United States

4.

TCRD and Pharos 2021: mining the human proteome for disease biology.

Sheils, Timothy K; Mathias, Stephen L; Kelleher, Keith J; Siramshetty, Vishal B; Nguyen, Dac-Trung; Bologa, Cristian G; Jensen, Lars Juhl; Vidovic, Dusica; Koleti, Amar; Schürer, Stephan C; Waller, Anna; Yang, Jeremy J; Holmes, Jayme; Bocci, Giovanni; Southall, Noel; Dharkar, Poorva; Mathé, Ewy; Simeonov, Anton; Oprea, Tudor I.

Nucleic Acids Res ; 49(D1): D1334-D1346, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33156327

ABSTRACT

In 2014, the National Institutes of Health (NIH) initiated the Illuminating the Druggable Genome (IDG) program to identify and improve our understanding of poorly characterized proteins that can potentially be modulated using small molecules or biologics. Two resources produced from these efforts are: The Target Central Resource Database (TCRD) (http://juniper.health.unm.edu/tcrd/) and Pharos (https://pharos.nih.gov/), a web interface to browse the TCRD. The ultimate goal of these resources is to highlight and facilitate research into currently understudied proteins, by aggregating a multitude of data sources, and ranking targets based on the amount of data available, and presenting data in machine learning ready format. Since the 2017 release, both TCRD and Pharos have produced two major releases, which have incorporated or expanded an additional 25 data sources. Recently incorporated data types include human and viral-human protein-protein interactions, protein-disease and protein-phenotype associations, and drug-induced gene signatures, among others. These aggregated data have enabled us to generate new visualizations and content sections in Pharos, in order to empower users to find new areas of study in the druggable genome.

Subjects

Databases, Factual , Genome, Human , Neurodegenerative Diseases/genetics , Proteomics/methods , Software , Virus Diseases/genetics , Animals , Anticonvulsants/chemistry , Anticonvulsants/therapeutic use , Antiviral Agents/chemistry , Antiviral Agents/therapeutic use , Biological Products/chemistry , Biological Products/therapeutic use , Data Mining/statistics & numerical data , Host-Pathogen Interactions/drug effects , Host-Pathogen Interactions/genetics , Humans , Internet , Machine Learning/statistics & numerical data , Mice , Mice, Knockout , Molecular Targeted Therapy/methods , Neurodegenerative Diseases/classification , Neurodegenerative Diseases/drug therapy , Neurodegenerative Diseases/virology , Protein Interaction Mapping , Proteome/agonists , Proteome/antagonists & inhibitors , Proteome/genetics , Proteome/metabolism , Small Molecule Libraries/chemistry , Small Molecule Libraries/therapeutic use , Virus Diseases/classification , Virus Diseases/drug therapy , Virus Diseases/virology

5.

Novel Consensus Architecture To Improve Performance of Large-Scale Multitask Deep Learning QSAR Models.

Zakharov, Alexey V; Zhao, Tongan; Nguyen, Dac-Trung; Peryea, Tyler; Sheils, Timothy; Yasgar, Adam; Huang, Ruili; Southall, Noel; Simeonov, Anton.

J Chem Inf Model ; 59(11): 4613-4624, 2019 11 25.

Article in English | MEDLINE | ID: mdl-31584270

ABSTRACT

Advances in the development of high-throughput screening and automated chemistry have rapidly accelerated the production of chemical and biological data, much of them freely accessible through literature aggregator services such as ChEMBL and PubChem. Here, we explore how to use this comprehensive mapping of chemical biology space to support the development of large-scale quantitative structure-activity relationship (QSAR) models. We propose a new deep learning consensus architecture (DLCA) that combines consensus and multitask deep learning approaches together to generate large-scale QSAR models. This method improves knowledge transfer across different target/assays while also integrating contributions from models based on different descriptors. The proposed approach was validated and compared with proteochemometrics, multitask deep learning, and Random Forest methods paired with various descriptors types. DLCA models demonstrated improved prediction accuracy for both regression and classification tasks. The best models together with their modeling sets are provided through publicly available web services at https://predictor.ncats.io .

Subjects

Deep Learning , Drug Discovery/methods , Quantitative Structure-Activity Relationship , Humans , Models, Biological , Online Systems , Software

6.

Pharos: Collating protein information to shed light on the druggable genome.

Nguyen, Dac-Trung; Mathias, Stephen; Bologa, Cristian; Brunak, Soren; Fernandez, Nicolas; Gaulton, Anna; Hersey, Anne; Holmes, Jayme; Jensen, Lars Juhl; Karlsson, Anneli; Liu, Guixia; Ma'ayan, Avi; Mandava, Geetha; Mani, Subramani; Mehta, Saurabh; Overington, John; Patel, Juhee; Rouillard, Andrew D; Schürer, Stephan; Sheils, Timothy; Simeonov, Anton; Sklar, Larry A; Southall, Noel; Ursu, Oleg; Vidovic, Dusica; Waller, Anna; Yang, Jeremy; Jadhav, Ajit; Oprea, Tudor I; Guha, Rajarshi.

Nucleic Acids Res ; 45(D1): D995-D1002, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27903890

ABSTRACT

The 'druggable genome' encompasses several protein families, but only a subset of targets within them have attracted significant research attention and thus have information about them publicly available. The Illuminating the Druggable Genome (IDG) program was initiated in 2014 with the goal of developing experimental techniques and a Knowledge Management Center (KMC) that would collect and organize information about protein targets from four families, representing the most common druggable targets with an emphasis on understudied proteins. Here, we describe two resources developed by the KMC: the Target Central Resource Database (TCRD) which collates many heterogeneous gene/protein datasets and Pharos (https://pharos.nih.gov), a multimodal web interface that presents the data from TCRD. We briefly describe the types and sources of data considered by the KMC and then highlight features of the Pharos interface designed to enable intuitive access to the IDG knowledgebase. The aim of Pharos is to encourage 'serendipitous browsing', whereby related, relevant information is made easily discoverable. We conclude by describing two use cases that highlight the utility of Pharos and TCRD.

Subjects

Databases, Genetic , Drug Discovery , Genomics , Pharmacogenetics , Search Engine , Cluster Analysis , Computational Biology/methods , Drug Discovery/methods , Genomics/methods , Humans , Obesity/drug therapy , Obesity/genetics , Obesity/metabolism , Pharmacogenetics/methods , Software , Web Browser

7.

Overview of the Knowledge Management Center for Illuminating the Druggable Genome.

Oprea, Tudor I; Bologa, Cristian; Holmes, Jayme; Mathias, Stephen; Metzger, Vincent T; Waller, Anna; Yang, Jeremy J; Leach, Andrew R; Jensen, Lars Juhl; Kelleher, Keith J; Sheils, Timothy K; Mathé, Ewy; Avram, Sorin; Edwards, Jeremy S.

Drug Discov Today ; 29(3): 103882, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38218214

ABSTRACT

The Knowledge Management Center (KMC) for the Illuminating the Druggable Genome (IDG) project aims to aggregate, update, and articulate protein-centric data knowledge for the entire human proteome, with emphasis on the understudied proteins from the three IDG protein families. KMC collates and analyzes data from over 70 resources to compile the Target Central Resource Database (TCRD), which underlies the web-based informatics platform (Pharos). These data include experimental, computational, and text-mined information on protein structures, compound interactions, and disease and phenotype associations. Based on this knowledge, proteins are classified into different Target Development Levels (TDLs) for identification of understudied targets. Additional work by the KMC focuses on enriching target knowledge and producing DrugCentral and other data visualization tools for expanding investigation of understudied targets.

Subjects

Genome , Knowledge Management , Humans , Proteome , Databases, Factual , Informatics

8.

TIN-X version 3: update with expanded dataset and modernized architecture for enhanced illumination of understudied targets.

Metzger, Vincent T; Cannon, Daniel C; Yang, Jeremy J; Mathias, Stephen L; Bologa, Cristian G; Waller, Anna; Schürer, Stephan C; Vidovic, Dusica; Kelleher, Keith J; Sheils, Timothy K; Jensen, Lars Juhl; Lambert, Christophe G; Oprea, Tudor I; Edwards, Jeremy S.

PeerJ ; 12: e17470, 2024.

Article in English | MEDLINE | ID: mdl-38948230

ABSTRACT

TIN-X (Target Importance and Novelty eXplorer) is an interactive visualization tool for illuminating associations between diseases and potential drug targets and is publicly available at newdrugtargets.org. TIN-X uses natural language processing to identify disease and protein mentions within PubMed content using previously published tools for named entity recognition (NER) of gene/protein and disease names. Target data is obtained from the Target Central Resource Database (TCRD). Two important metrics, novelty and importance, are computed from this data and, when plotted as log(importance) vs. log(novelty), aid the user in visually exploring the novelty of drug targets and their associated importance to diseases. TIN-X Version 3.0 has been significantly improved with an expanded dataset, modernized architecture including a REST API, and an improved user interface (UI). The dataset has been expanded to include not only PubMed publication titles and abstracts, but also full-text articles when available. This results in approximately 9-fold more target/disease associations compared to previous versions of TIN-X. Additionally, the TIN-X database containing this expanded dataset is now hosted in the cloud via Amazon RDS. Recent enhancements to the UI focus on making it more intuitive for users to find diseases or drug targets of interest while providing a new, sortable table-view mode to accompany the existing plot-view mode. UI improvements also help the user browse the associated PubMed publications to explore and understand the basis of TIN-X's predicted association between a specific disease and a target of interest. While implementing these upgrades, computational resources are balanced between the webserver and the user's web browser to achieve adequate performance while accommodating the expanded dataset. Together, these advances aim to extend the duration that users can benefit from TIN-X while providing both an expanded dataset and new features that researchers can use to better illuminate understudied proteins.

Subjects

User-Computer Interface , Humans , Natural Language Processing , PubMed , Software
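The novelty and importance metrics named in the TIN-X abstract above are not defined there. The sketch below assumes a fractional-counting scheme: each paper contributes 1/T to a target's mention weight (T being the number of targets in that paper), novelty is the reciprocal of the summed weight, and importance sums 1/(T x D) over papers that mention both the target and the disease (D being the number of diseases in the paper). These formulas, the function names, the toy data, and the base-10 logarithm are all assumptions for illustration, not a statement of how TIN-X is implemented.

```python
from math import log10

# A paper is modelled as (set of target names, set of disease names).
Paper = tuple[set[str], set[str]]


def novelty(papers: list[Paper], target: str) -> float:
    """Reciprocal of the fractional mention count of `target` (assumed formula)."""
    weight = sum(1 / len(targets) for targets, _ in papers if target in targets)
    return 1 / weight if weight else float("inf")


def importance(papers: list[Paper], target: str, disease: str) -> float:
    """Fractional co-mention count of `target` and `disease` (assumed formula)."""
    return sum(
        1 / (len(targets) * len(diseases))
        for targets, diseases in papers
        if target in targets and disease in diseases
    )


def log_point(papers: list[Paper], target: str, disease: str):
    """Return (log10 novelty, log10 importance), or None if the pair never co-occurs."""
    imp = importance(papers, target, disease)
    nov = novelty(papers, target)
    if imp == 0 or nov == float("inf"):
        return None
    return log10(nov), log10(imp)


if __name__ == "__main__":
    # Toy corpus with hypothetical target and disease labels.
    corpus = [({"A"}, {"X"}), ({"A", "B"}, {"X", "Y"})]
    print(log_point(corpus, "A", "X"))
    print(log_point(corpus, "B", "Y"))
```

Under this scheme a target mentioned in few papers, or only alongside many other targets, receives a high novelty value, which is what makes the log-log scatter useful for spotting understudied targets.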

9.

Getting Started with the IDG KMC Datasets and Tools.

Kropiwnicki, Eryk; Binder, Jessica L; Yang, Jeremy J; Holmes, Jayme; Lachmann, Alexander; Clarke, Daniel J B; Sheils, Timothy; Kelleher, Keith J; Metzger, Vincent T; Bologa, Cristian G; Oprea, Tudor I; Ma'ayan, Avi.

Curr Protoc ; 2(1): e355, 2022 Jan.

Article in English | MEDLINE | ID: mdl-35085427

ABSTRACT

The Illuminating the Druggable Genome (IDG) consortium is a National Institutes of Health (NIH) Common Fund program designed to enhance our knowledge of under-studied proteins, more specifically, proteins unannotated within the three most commonly drug-targeted protein families: G-protein coupled receptors, ion channels, and protein kinases. Since 2014, the IDG Knowledge Management Center (IDG-KMC) has generated several open-access datasets and resources that jointly serve as a highly translational machine-learning-ready knowledgebase focused on human protein-coding genes and their products. The goal of the IDG-KMC is to develop comprehensive integrated knowledge for the druggable genome to illuminate the uncharacterized or poorly annotated portion of the druggable genome. The tools derived from the IDG-KMC provide either user-friendly visualizations or ways to impute the knowledge about potential targets using machine learning strategies. In the following protocols, we describe how to use each web-based tool to accelerate illumination in under-studied proteins. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. 
Basic Protocol 1: Interacting with the Pharos user interface
Basic Protocol 2: Accessing the data in Harmonizome
Basic Protocol 3: The ARCHS4 resource
Basic Protocol 4: Making predictions about gene function with PrismExp
Basic Protocol 5: Using Geneshot to illuminate knowledge about under-studied targets
Basic Protocol 6: Exploring under-studied targets with TIN-X
Basic Protocol 7: Interacting with the DrugCentral user interface
Basic Protocol 8: Estimating Anti-SARS-CoV-2 activities with DrugCentral REDIAL-2020
Basic Protocol 9: Drug Set Enrichment Analysis using Drugmonizome
Basic Protocol 10: The Drugmonizome-ML Appyter
Basic Protocol 11: The Harmonizome-ML Appyter
Basic Protocol 12: GWAS target illumination with TIGA
Basic Protocol 13: Prioritizing kinases for lists of proteins and phosphoproteins with KEA3
Basic Protocol 14: Converting PubMed searches to drug sets with the DrugShot Appyter

Subjects

Databases, Genetic , Genome , COVID-19 , Humans , Machine Learning , Proteins , SARS-CoV-2

10.

Scientific evidence based rare disease research discovery with research funding data in knowledge graph.

Zhu, Qian; Nguyen, Ðac-Trung; Sheils, Timothy; Alyea, Gioconda; Sid, Eric; Xu, Yanji; Dickens, James; Mathé, Ewy A; Pariser, Anne.

Orphanet J Rare Dis ; 16(1): 483, 2021 11 18.

Article in English | MEDLINE | ID: mdl-34794473

ABSTRACT

BACKGROUND: Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to inspire and encourage scientists to propose and pursue innovative research studies that aim to uncover the genetic and molecular causes of more rare diseases and ultimately to identify effective therapeutic solutions. A clear understanding of current research efforts, knowledge/research gaps, and funding patterns as scientific evidence is crucial to systematically accelerate the pace of research discovery in rare diseases, which is an overarching goal of this study. METHODS: To semantically represent NIH funding data for rare diseases and advance its use in effectively promoting rare disease research, we identified NIH funded projects for rare diseases by mapping GARD diseases to projects based on project titles; subsequently we presented and managed those identified projects in a knowledge graph using Neo4j software, hosted at NCATS, based on a pre-defined data model that captures semantics among the data. With this developed knowledge graph, we were able to perform several case studies to demonstrate scientific evidence generation for supporting rare disease research discovery. RESULTS: Of 5001 rare diseases belonging to 32 distinct disease categories, we identified 1294 diseases that are mapped to 45,647 distinct, NIH-funded projects obtained from the NIH ExPORTER by implementing semantic annotation of project titles. To capture semantic relationships present among mapped research funding data, we defined a data model comprised of seven primary classes and corresponding object and data properties. A Neo4j knowledge graph based on this predefined data model has been developed, and we performed multiple case studies over this knowledge graph to demonstrate its use in directing and promoting rare disease research.
CONCLUSION: We developed an integrative knowledge graph with rare disease funding data and demonstrated its use as a source from which we can effectively identify and generate scientific evidence to support rare disease research. With the success of this preliminary study, we plan to implement advanced computational approaches for analyzing more funding related data, e.g., project abstracts and PubMed article abstracts, and linking to other types of biomedical data to perform more sophisticated research gap analysis and identify opportunities for future research in rare diseases.

Subjects

Biomedical Research , Rare Diseases , Humans , Pattern Recognition, Automated
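The title-based mapping step described in the METHODS of the abstract above can be pictured with a deliberately simplified sketch. The study reports semantic annotation of project titles; the code below swaps in plain case-insensitive whole-phrase matching, which is an assumption made for brevity and not the authors' method. The function name, project identifiers, and titles are hypothetical.

```python
import re


def map_diseases_to_projects(diseases: list[str], projects: dict[str, str]) -> dict[str, list[str]]:
    """Map each disease name to the IDs of projects whose title contains it.

    diseases: disease names (for example, from a rare disease vocabulary)
    projects: {project_id: project_title}
    Matching is case-insensitive and requires word boundaries on both sides,
    so a short name such as "ALS" does not match inside "signals".
    """
    mapping: dict[str, list[str]] = {}
    for disease in diseases:
        pattern = re.compile(r"(?<!\w)" + re.escape(disease) + r"(?!\w)", re.IGNORECASE)
        matched = [pid for pid, title in projects.items() if pattern.search(title)]
        if matched:
            mapping[disease] = matched
    return mapping


if __name__ == "__main__":
    # Hypothetical project titles for illustration only.
    titles = {
        "P1": "Biomarkers for ALS progression",
        "P2": "Calcium signals in neurons",
        "P3": "Airway repair in cystic fibrosis",
    }
    print(map_diseases_to_projects(["ALS", "Cystic fibrosis", "Progeria"], titles))
```

The resulting disease-to-project pairs are the kind of edges one could then load into a graph database; synonyms, abbreviations, and spelling variants would need a proper annotation tool rather than literal matching.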

11.

SmartGraph: a network pharmacology investigation platform.

Zahoránszky-Kohalmi, Gergely; Sheils, Timothy; Oprea, Tudor I.

J Cheminform ; 12(1): 5, 2020 Jan 21.

Article in English | MEDLINE | ID: mdl-33430980

ABSTRACT

MOTIVATION: Drug discovery investigations need to incorporate network pharmacology concepts while navigating the complex landscape of drug-target and target-target interactions. This task requires solutions that integrate high-quality biomedical data, combined with analytic and predictive workflows as well as efficient visualization. SmartGraph is an innovative platform that utilizes state-of-the-art technologies such as a Neo4j graph-database, Angular web framework, RxJS asynchronous event library and D3 visualization to accomplish these goals. RESULTS: The SmartGraph framework integrates high quality bioactivity data and biological pathway information resulting in a knowledgebase comprised of 420,526 unique compound-target interactions defined between 271,098 unique compounds and 2018 targets. SmartGraph then performs bioactivity predictions based on the 63,783 Bemis-Murcko scaffolds extracted from these compounds. Through several use-cases, we illustrate the use of SmartGraph to generate hypotheses for elucidating mechanism-of-action, drug-repurposing and off-target prediction. AVAILABILITY: https://smartgraph.ncats.io/.

12.

How to Illuminate the Druggable Genome Using Pharos.

Sheils, Timothy; Mathias, Stephen L; Siramshetty, Vishal B; Bocci, Giovanni; Bologa, Cristian G; Yang, Jeremy J; Waller, Anna; Southall, Noel; Nguyen, Dac-Trung; Oprea, Tudor I.

Curr Protoc Bioinformatics ; 69(1): e92, 2020 03.

Article in English | MEDLINE | ID: mdl-31898878

ABSTRACT

Pharos is an integrated web-based informatics platform for the analysis of data aggregated by the Illuminating the Druggable Genome (IDG) Knowledge Management Center, an NIH Common Fund initiative. The current version of Pharos (as of October 2019) spans 20,244 proteins in the human proteome, 19,880 disease and phenotype associations, and 226,829 ChEMBL compounds. This resource not only collates and analyzes data from over 60 high-quality resources to generate these types, but also uses text indexing to find less apparent connections between targets, and has recently begun to collaborate with institutions that generate data and resources. Proteins are ranked according to a knowledge-based classification system, which can help researchers to identify less studied "dark" targets that could be potentially further illuminated. This is an important process for both drug discovery and target validation, as more knowledge can accelerate target identification, and previously understudied proteins can serve as novel targets in drug discovery. Two basic protocols illustrate the levels of detail available for targets and several methods of finding targets of interest. An Alternate Protocol illustrates the difference in available knowledge between less and more studied targets. © 2020 by John Wiley & Sons, Inc.
Basic Protocol 1: Search for a target and view details
Alternate Protocol: Search for dark target and view details
Basic Protocol 2: Filter a target list to get refined results

Subjects

Drug Discovery , Genome , Software , Breast Neoplasms/genetics , Drug Delivery Systems , Female , Genome-Wide Association Study , Humans , Ligands , Receptors, G-Protein-Coupled/metabolism
