Search | BVS Infectious and Parasitic Diseases

1.

Scientific discovery in the age of artificial intelligence.

Wang, Hanchen; Fu, Tianfan; Du, Yuanqi; Gao, Wenhao; Huang, Kexin; Liu, Ziming; Chandak, Payal; Liu, Shengchao; Van Katwyk, Peter; Deac, Andreea; Anandkumar, Anima; Bergen, Karianne; Gomes, Carla P; Ho, Shirley; Kohli, Pushmeet; Lasenby, Joan; Leskovec, Jure; Liu, Tie-Yan; Manrai, Arjun; Marks, Debora; Ramsundar, Bharath; Song, Le; Sun, Jimeng; Tang, Jian; Velickovic, Petar; Welling, Max; Zhang, Linfeng; Coley, Connor W; Bengio, Yoshua; Zitnik, Marinka.

Nature ; 620(7972): 47-60, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37532811

ABSTRACT

Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.

Subjects

Artificial Intelligence, Research Design, Artificial Intelligence/standards, Artificial Intelligence/trends, Datasets as Topic, Deep Learning, Research Design/standards, Research Design/trends, Unsupervised Machine Learning

2.

Publisher Correction: Scientific discovery in the age of artificial intelligence.

Wang, Hanchen; Fu, Tianfan; Du, Yuanqi; Gao, Wenhao; Huang, Kexin; Liu, Ziming; Chandak, Payal; Liu, Shengchao; Van Katwyk, Peter; Deac, Andreea; Anandkumar, Anima; Bergen, Karianne; Gomes, Carla P; Ho, Shirley; Kohli, Pushmeet; Lasenby, Joan; Leskovec, Jure; Liu, Tie-Yan; Manrai, Arjun; Marks, Debora; Ramsundar, Bharath; Song, Le; Sun, Jimeng; Tang, Jian; Velickovic, Petar; Welling, Max; Zhang, Linfeng; Coley, Connor W; Bengio, Yoshua; Zitnik, Marinka.

Nature ; 621(7978): E33, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37648871

3.

DeepPurpose: a deep learning library for drug-target interaction prediction.

Huang, Kexin; Fu, Tianfan; Glass, Lucas M; Zitnik, Marinka; Xiao, Cao; Sun, Jimeng.

Bioinformatics ; 36(22-23): 5545-5547, 2021 Apr 01.

Article in English | MEDLINE | ID: mdl-33275143

ABSTRACT

SUMMARY: Accurate prediction of drug-target interactions (DTI) is crucial for drug discovery. Recently, deep learning (DL) models have shown promising performance for DTI prediction. However, these models can be difficult to use for both computer scientists entering the biomedical field and bioinformaticians with limited DL experience. We present DeepPurpose, a comprehensive and easy-to-use DL library for DTI prediction. DeepPurpose supports training of customized DTI prediction models by implementing 15 compound and protein encoders and over 50 neural architectures, along with providing many other useful features. We demonstrate state-of-the-art performance of DeepPurpose on several benchmark datasets. AVAILABILITY AND IMPLEMENTATION: https://github.com/kexinhuang12345/DeepPurpose. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Deep Learning, Pharmaceutical Preparations, Drug Development, Drug Discovery, Proteins

4.

MOLER: Incorporate Molecule-Level Reward to Enhance Deep Generative Model for Molecule Optimization.

Fu, Tianfan; Xiao, Cao; Glass, Lucas M; Sun, Jimeng.

IEEE Trans Knowl Data Eng ; 34(11): 5459-5471, 2022 Nov.

Article in English | MEDLINE | ID: mdl-36590707

ABSTRACT

The goal of molecular optimization is to generate molecules similar to a target molecule but with better chemical properties. Deep generative models have shown great success in molecule optimization. However, due to the iterative local generation process of deep generative models, the resulting molecules can significantly deviate from the input in molecular similarity and size, leading to poor chemical properties. The key issue here is that the existing deep generative models restrict their attention to substructure-level generation without considering the entire molecule as a whole. To address this challenge, we propose Molecule-Level Reward functions (MOLER) to encourage (1) the input and the generated molecule to be similar, and to ensure (2) the generated molecule has a similar size to the input. The proposed method can be combined with various deep generative models. A policy gradient technique is introduced to optimize reward-based objectives with small computational overhead. Empirical studies show that MOLER achieves up to 20.2% relative improvement in success rate over the best baseline method on several properties, including QED, DRD2 and LogP.
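
The two molecule-level goals named in this abstract (similarity to the input and similar size) can be illustrated with a small sketch. The abstract does not give the functional form of the reward, so everything below is an assumption made for illustration: fingerprints are represented as plain sets of integer bit indices, similarity is Tanimoto similarity, size is an atom count, and the weights `w_sim` and `w_size` are hypothetical. It is not the MOLER implementation.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not fp_a and not fp_b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def size_reward(n_in, n_out):
    """1.0 when atom counts match, falling linearly to 0.0 as they diverge."""
    return max(0.0, 1.0 - abs(n_in - n_out) / max(n_in, 1))


def molecule_level_reward(fp_in, fp_out, n_in, n_out, w_sim=0.5, w_size=0.5):
    """Weighted sum of a similarity term and a size term (illustrative weights)."""
    return w_sim * tanimoto(fp_in, fp_out) + w_size * size_reward(n_in, n_out)


if __name__ == "__main__":
    # Hypothetical fingerprints and atom counts for an input/output pair.
    reward = molecule_level_reward({1, 2, 3}, {2, 3, 4}, n_in=10, n_out=12)
    print(round(reward, 3))
```

In a policy-gradient setting, a scalar like this would be computed once per generated molecule and used to weight the log-probability of the generation steps; how the paper does this in detail is not stated in the abstract.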

5.

Artificial intelligence foundation for therapeutic science.

Huang, Kexin; Fu, Tianfan; Gao, Wenhao; Zhao, Yue; Roohani, Yusuf; Leskovec, Jure; Coley, Connor W; Xiao, Cao; Sun, Jimeng; Zitnik, Marinka.

Nat Chem Biol ; 18(10): 1033-1036, 2022 Oct.

Article in English | MEDLINE | ID: mdl-36131149

6.

drGAT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network.

Inoue, Yoshitaka; Lee, Hunmin; Fu, Tianfan; Luna, Augustin.

ArXiv ; 2024 May 14.

Article in English | MEDLINE | ID: mdl-38800657

ABSTRACT

Drug development is a lengthy process with a high failure rate. Increasingly, machine learning is utilized to facilitate the drug development processes. These models aim to enhance our understanding of drug characteristics, including their activity in biological contexts. However, a major challenge in drug response (DR) prediction is model interpretability as it aids in the validation of findings. This is important in biomedicine, where models need to be understandable in comparison with established knowledge of drug interactions with proteins. drGAT, a graph deep learning model, leverages a heterogeneous graph composed of relationships between proteins, cell lines, and drugs. drGAT is designed with two objectives: DR prediction as a binary sensitivity prediction and elucidation of drug mechanism from attention coefficients. drGAT has demonstrated superior performance over existing models, achieving 78% accuracy (and precision), and 76% F1 score for 269 DNA-damaging compounds of the NCI60 drug response dataset. To assess the model's interpretability, we conducted a review of drug-gene co-occurrences in PubMed abstracts in comparison to the top 5 genes with the highest attention coefficients for each drug. We also examined whether known relationships were retained in the model by inspecting the neighborhoods of topoisomerase-related drugs. For example, our model retained TOP1 as a highly weighted predictive feature for irinotecan and topotecan, in addition to other genes that could potentially be regulators of the drugs. Our method can be used to accurately predict sensitivity to drugs and may be useful in the identification of biomarkers relating to the treatment of cancer patients.
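
The interpretability check described in this abstract ranks genes by attention coefficient for each drug and keeps the top 5. A minimal sketch of that ranking step follows. The attention values and the gene names other than TOP1 are made up for illustration, and the function name `top_k_genes` is hypothetical; this is not code from the drGAT repository.

```python
def top_k_genes(attention, k=5):
    """Return the k gene names with the highest attention coefficients.

    `attention` maps gene name -> attention coefficient for a single drug.
    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(attention.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical attention coefficients for one drug (e.g. topotecan).
    attention_for_drug = {
        "TOP1": 0.41,
        "GENE_A": 0.12,
        "GENE_B": 0.30,
        "GENE_C": 0.05,
    }
    print(top_k_genes(attention_for_drug, k=2))
```

The resulting gene list per drug is what would then be compared against drug-gene co-occurrences in the literature.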

7.

Uncertainty Quantification and Interpretability for Clinical Trial Approval Prediction.

Lu, Yingzhou; Chen, Tianyi; Hao, Nan; Van Rechem, Capucine; Chen, Jintai; Fu, Tianfan.

Health Data Sci ; 4: 0126, 2024.

Article in English | MEDLINE | ID: mdl-38645573

ABSTRACT

Background: A clinical trial is a crucial step in the development of a new therapy (e.g., medication) and is remarkably expensive and time-consuming. Forecasting the approval of clinical trials accurately would enable us to circumvent trials destined to fail, thereby allowing us to allocate more resources to therapies with better chances. However, existing approval prediction algorithms neither quantify uncertainty nor provide interpretability, limiting their usage in real-world clinical trial management. Methods: This paper quantifies uncertainty and improves interpretability in clinical trial approval predictions. We devised a selective classification approach and integrated it with the Hierarchical Interaction Network, the state-of-the-art clinical trial prediction model. Selective classification, encompassing a spectrum of methods for uncertainty quantification, empowers the model to withhold decision-making in the face of samples marked by ambiguity or low confidence. This approach not only amplifies the accuracy of predictions for the instances it chooses to classify but also notably enhances the model's interpretability. Results: Comprehensive experiments demonstrate that incorporating uncertainty markedly enhances the model's performance. Specifically, the proposed method achieved 32.37%, 21.43%, and 13.27% relative improvement on area under the precision-recall curve over the base model (Hierarchical Interaction Network) in phase I, II, and III trial approval predictions, respectively. For phase III trials, our method reaches an area under the precision-recall curve of 0.9022. In addition, we show a case study of interpretability that helps domain experts to understand the model's outcome. The code is publicly available at https://github.com/Vincent-1125/Uncertainty-Quantification-on-Clinical-Trial-Outcome-Prediction. Conclusion: Our approach not only measures model uncertainty but also greatly improves interpretability and performance for clinical trial approval prediction.
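
Selective classification, as this abstract describes it, lets a model withhold a decision on low-confidence samples and report accuracy only on the samples it chooses to classify. The sketch below shows one simple form of the idea, a fixed confidence threshold on a binary probability. The threshold value, function names, and example data are assumptions for illustration; the abstract does not specify which selective classification method the authors integrated with the Hierarchical Interaction Network.

```python
def selective_predict(probs, threshold=0.8):
    """Return 1 or 0 for confident predictions and None to abstain.

    `probs` are predicted probabilities of the positive class (approval).
    Confidence is the probability assigned to the more likely class.
    """
    decisions = []
    for p in probs:
        confidence = max(p, 1.0 - p)
        decisions.append(None if confidence < threshold else int(p >= 0.5))
    return decisions


def selective_accuracy(decisions, labels):
    """Accuracy on the non-abstained samples, plus coverage (fraction kept)."""
    kept = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    coverage = len(kept) / len(labels) if labels else 0.0
    accuracy = sum(d == y for d, y in kept) / len(kept) if kept else float("nan")
    return accuracy, coverage


if __name__ == "__main__":
    # Hypothetical model outputs and ground-truth approval labels.
    probs = [0.95, 0.55, 0.10, 0.60]
    labels = [1, 0, 0, 1]
    decisions = selective_predict(probs, threshold=0.8)
    print(decisions, selective_accuracy(decisions, labels))
```

Raising the threshold trades coverage for accuracy on the retained samples, which is the trade-off any selective classifier has to report.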

8.

TDC-2: Multimodal Foundation for Therapeutic Science.

Velez-Arce, Alejandro; Huang, Kexin; Li, Michelle M; Lin, Xiang; Gao, Wenhao; Fu, Tianfan; Kellis, Manolis; Pentelute, Bradley L; Zitnik, Marinka.

bioRxiv ; 2024 Jun 21.

Article in English | MEDLINE | ID: mdl-38948789

ABSTRACT

Therapeutics Data Commons (tdcommons.ai) is an open science initiative with unified datasets, AI models, and benchmarks to support research across therapeutic modalities and drug discovery and development stages. The Commons 2.0 (TDC-2) is a comprehensive overhaul of Therapeutics Data Commons to catalyze research in multimodal models for drug discovery by unifying single-cell biology of diseases, biochemistry of molecules, and effects of drugs through multimodal datasets, AI-powered API endpoints, new multimodal tasks and model frameworks, and comprehensive benchmarks. TDC-2 introduces over 1,000 multimodal datasets spanning approximately 85 million cells, pre-calculated embeddings from 5 state-of-the-art single-cell models, and a biomedical knowledge graph. TDC-2 drastically expands the coverage of ML tasks across therapeutic pipelines and 10+ new modalities, spanning but not limited to single-cell gene expression data, clinical trial data, peptide sequence data, peptidomimetics protein-peptide interaction data regarding newly discovered ligands derived from AS-MS spectroscopy, novel 3D structural data for proteins, and cell-type-specific protein-protein interaction networks at single-cell resolution. TDC-2 introduces multimodal data access under an API-first design using the model-view-controller paradigm. TDC-2 introduces 7 novel ML tasks with fine-grained biological contexts: contextualized drug-target identification, single-cell chemical/genetic perturbation response prediction, protein-peptide binding affinity prediction task, and clinical trial outcome prediction task, which introduce antigen-processing-pathway-specific, cell-type-specific, peptide-specific, and patient-specific biological contexts. TDC-2 also releases benchmarks evaluating 15+ state-of-the-art models across 5+ new learning tasks evaluating models on diverse biological contexts and sampling approaches. Among these, TDC-2 provides the first benchmark for context-specific learning. TDC-2, to our knowledge, is also the first to introduce a protein-peptide binding interaction benchmark.

9.

HINT: Hierarchical interaction network for clinical-trial-outcome predictions.

Fu, Tianfan; Huang, Kexin; Xiao, Cao; Glass, Lucas M; Sun, Jimeng.

Patterns (N Y) ; 3(4): 100445, 2022 Apr 08.

Article in English | MEDLINE | ID: mdl-35465223

ABSTRACT

Clinical trials are crucial for drug development but often face uncertain outcomes due to safety, efficacy, or patient-recruitment problems. We propose the Hierarchical Interaction Network (HINT) to predict clinical trial outcomes. First, HINT encodes multi-modal data (drug molecule, target disease, trial eligibility criteria) into embeddings. Then, HINT trains knowledge-embedding modules using drug pharmacokinetic and historical trial data. Finally, a hierarchical interaction graph connects all of the embeddings to capture their interactions and predict trial outcomes. HINT was trained and validated on 1,160 phase I trials, 4,449 phase II trials, and 3,436 phase III trials. It obtained 0.665, 0.620, and 0.847 F1 scores on separate test sets of 627 phase I, 1,653 phase II, and 1,140 phase III trials, respectively. HINT significantly outperforms the best baseline method on most metrics. The benchmark dataset and codes are released at https://github.com/futianfan/clinical-trial-outcome-prediction.

10.

DDL: Deep Dictionary Learning for Predictive Phenotyping.

Fu, Tianfan; Hoang, Trong Nghia; Xiao, Cao; Sun, Jimeng.

IJCAI (U S) ; 2019: 5857-5863, 2019 Aug.

Article in English | MEDLINE | ID: mdl-33767572

ABSTRACT

Predictive phenotyping is about accurately predicting what phenotypes will occur in the next clinical visit based on longitudinal Electronic Health Record (EHR) data. While deep learning (DL) models have recently demonstrated strong performance in predictive phenotyping, they require access to a large amount of labeled data, which are expensive to acquire. To address this label-insufficient challenge, we propose a deep dictionary learning framework (DDL) for phenotyping, which utilizes unlabeled data as a complementary source of information to generate a better, more succinct data representation. Our empirical evaluations on multiple EHR datasets demonstrated that DDL outperforms the existing predictive phenotyping methods on a wide variety of clinical tasks that require patient phenotyping. The results also show that unlabeled data can be used to generate a better data representation that helps improve DDL's phenotyping performance over existing methods that only use labeled data.
