1.
Cell ; 187(3): 526-544, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38306980

ABSTRACT

Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles (tunability, controllability, and modularity) into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.


Subjects
Artificial Intelligence, Proteins, Protein Conformation, Proteins/chemistry, Proteins/metabolism, Protein Engineering, Deep Learning
2.
Cell ; 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39389057

ABSTRACT

Current metagenomic tools can fail to identify highly divergent RNA viruses. We developed a deep learning algorithm, termed LucaProt, to discover highly divergent RNA-dependent RNA polymerase (RdRP) sequences in 10,487 metatranscriptomes generated from diverse global ecosystems. LucaProt integrates both sequence and predicted structural information, enabling the accurate detection of RdRP sequences. Using this approach, we identified 161,979 potential RNA virus species and 180 RNA virus supergroups, including many previously poorly studied groups, as well as RNA virus genomes of exceptional length (up to 47,250 nucleotides) and genomic complexity. A subset of these novel RNA viruses was confirmed by RT-PCR and RNA/DNA sequencing. Newly discovered RNA viruses were present in diverse environments, including air, hot springs, and hydrothermal vents, with virus diversity and abundance varying substantially among ecosystems. This study advances virus discovery, highlights the scale of the virosphere, and provides computational tools to better document the global RNA virome.

3.
Cell ; 186(22): 4868-4884.e12, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37863056

ABSTRACT

Single-cell analysis in living humans is essential for understanding disease mechanisms, but it is impractical in non-regenerative organs, such as the eye and brain, because tissue biopsies would cause serious damage. We resolve this problem by integrating proteomics of liquid biopsies with single-cell transcriptomics from all known ocular cell types to trace the cellular origin of 5,953 proteins detected in the aqueous humor. We identified hundreds of cell-specific protein markers, including for individual retinal cell types. Surprisingly, our results reveal that retinal degeneration occurs in Parkinson's disease, and the cells driving diabetic retinopathy switch with disease stage. Finally, we developed artificial intelligence (AI) models to assess individual cellular aging and found that many eye diseases not associated with chronological age undergo accelerated molecular aging of disease-specific cell types. Our approach, which can be applied to other organ systems, has the potential to transform molecular diagnostics and prognostics while uncovering new cellular disease and aging mechanisms.


Subjects
Aging, Aqueous Humor, Artificial Intelligence, Liquid Biopsy, Proteomics, Humans, Aging/metabolism, Aqueous Humor/chemistry, Biopsy, Parkinson Disease/diagnosis
4.
Cell ; 186(7): 1328-1336.e10, 2023 03 30.
Article in English | MEDLINE | ID: mdl-37001499

ABSTRACT

Stressed plants show altered phenotypes, including changes in color, smell, and shape. Yet, airborne sounds emitted by stressed plants have not been investigated before. Here we show that stressed plants emit airborne sounds that can be recorded from a distance and classified. We recorded ultrasonic sounds emitted by tomato and tobacco plants inside an acoustic chamber, and in a greenhouse, while monitoring the plants' physiological parameters. We developed machine learning models that succeeded in identifying the condition of the plants, including dehydration level and injury, based solely on the emitted sounds. These informative sounds may also be detectable by other organisms. This work opens avenues for understanding plants and their interactions with the environment and may have a significant impact on agriculture.


Subjects
Plants, Sound, Physiological Stress
5.
Cell ; 185(21): 4008-4022.e14, 2022 10 13.
Article in English | MEDLINE | ID: mdl-36150393

ABSTRACT

The continual evolution of SARS-CoV-2 and the emergence of variants that show resistance to vaccines and neutralizing antibodies threaten to prolong the COVID-19 pandemic. Selection and emergence of SARS-CoV-2 variants are driven in part by mutations within the viral spike protein and in particular the ACE2 receptor-binding domain (RBD), a primary target site for neutralizing antibodies. Here, we develop deep mutational learning (DML), a machine-learning-guided protein engineering technology, which is used to investigate a massive sequence space of combinatorial mutations, representing billions of RBD variants, by accurately predicting their impact on ACE2 binding and antibody escape. A highly diverse landscape of possible SARS-CoV-2 variants is identified that could emerge from a multitude of evolutionary trajectories. DML may be used for predictive profiling on current and prospective variants, including highly mutated variants such as Omicron, thus guiding the development of therapeutic antibody treatments and vaccines for COVID-19.


Subjects
Angiotensin-Converting Enzyme 2/metabolism, COVID-19, SARS-CoV-2, Coronavirus Spike Glycoprotein/metabolism, Angiotensin-Converting Enzyme 2/chemistry, Angiotensin-Converting Enzyme 2/genetics, Neutralizing Antibodies, Viral Antibodies, COVID-19 Vaccines, Humans, Mutation, Pandemics, Protein Binding, SARS-CoV-2/genetics, Coronavirus Spike Glycoprotein/chemistry, Coronavirus Spike Glycoprotein/genetics
6.
Cell ; 183(2): 335-346.e13, 2020 10 15.
Article in English | MEDLINE | ID: mdl-33035452

ABSTRACT

Muscle spasticity after nervous system injuries and painful low back spasm affect more than 10% of the global population. Current medications are of limited efficacy and cause neurological and cardiovascular side effects because they target upstream regulators of muscle contraction. Direct myosin inhibition could provide optimal muscle relaxation; however, targeting skeletal myosin is particularly challenging because of its similarity to the cardiac isoform. We identified a key residue difference between these myosin isoforms, located in the communication center of the functional regions, which allowed us to design a selective inhibitor, MPH-220. Mutagenic analysis and the atomic structure of MPH-220-bound skeletal muscle myosin confirmed the mechanism of specificity. Targeting skeletal muscle myosin by MPH-220 enabled muscle relaxation, in human and model systems, without cardiovascular side effects and improved spastic gait disorders after brain injury in a disease model. MPH-220 provides a potential nervous-system-independent option to treat spasticity and muscle stiffness.


Subjects
Skeletal Muscle/metabolism, Skeletal Muscle Myosins/drug effects, Skeletal Muscle Myosins/genetics, Adult, Animals, Cardiac Myosins/genetics, Cardiac Myosins/metabolism, Cell Line, Drug Delivery Systems, Female, Humans, Male, Mice, Muscle Contraction/physiology, Skeletal Muscle Fibers/physiology, Muscle Spasticity/genetics, Muscle Spasticity/physiopathology, Skeletal Muscle/physiology, Myosins/drug effects, Myosins/genetics, Myosins/metabolism, Protein Isoforms, Rats, Wistar Rats, Skeletal Muscle Myosins/metabolism
7.
Cell ; 176(3): 535-548.e24, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30661751

ABSTRACT

The splicing of pre-mRNAs into mature transcripts is remarkable for its precision, but the mechanisms by which the cellular machinery achieves such specificity are incompletely understood. Here, we describe a deep neural network that accurately predicts splice junctions from an arbitrary pre-mRNA transcript sequence, enabling precise prediction of noncoding genetic variants that cause cryptic splicing. Synonymous and intronic mutations with predicted splice-altering consequence validate at a high rate on RNA-seq and are strongly deleterious in the human population. De novo mutations with predicted splice-altering consequence are significantly enriched in patients with autism and intellectual disability compared to healthy controls and validate against RNA-seq in 21 out of 28 of these patients. We estimate that 9%-11% of pathogenic mutations in patients with rare genetic disorders are caused by this previously underappreciated class of disease variation.


Subjects
Forecasting/methods, RNA Precursors/genetics, RNA Splicing/genetics, Algorithms, Alternative Splicing/genetics, Autistic Disorder/genetics, Deep Learning, Exons/genetics, Humans, Intellectual Disability/genetics, Introns/genetics, Neural Networks (Computer), RNA Precursors/metabolism, RNA Splice Sites/genetics, RNA Splice Sites/physiology
8.
Cell ; 172(5): 1122-1131.e9, 2018 02 22.
Article in English | MEDLINE | ID: mdl-29474911

ABSTRACT

The implementation of clinical-decision support algorithms for medical imaging faces challenges with reliability and interpretability. Here, we establish a diagnostic tool based on a deep-learning framework for the screening of patients with common treatable blinding retinal diseases. Our framework utilizes transfer learning, which trains a neural network with a fraction of the data of conventional approaches. Applying this approach to a dataset of optical coherence tomography images, we demonstrate performance comparable to that of human experts in classifying age-related macular degeneration and diabetic macular edema. We also provide a more transparent and interpretable diagnosis by highlighting the regions recognized by the neural network. We further demonstrate the general applicability of our AI system for diagnosis of pediatric pneumonia using chest X-ray images. This tool may ultimately aid in expediting the diagnosis and referral of these treatable conditions, thereby facilitating earlier treatment, resulting in improved clinical outcomes.


Subjects
Deep Learning, Diagnostic Imaging, Pneumonia/diagnosis, Child, Humans, Neural Networks (Computer), Pneumonia/diagnostic imaging, ROC Curve, Reproducibility of Results, Optical Coherence Tomography
9.
Physiol Rev ; 103(4): 2423-2450, 2023 10 01.
Article in English | MEDLINE | ID: mdl-37104717

ABSTRACT

Artificial intelligence in health care has experienced remarkable innovation and progress in the last decade. Significant advancements can be attributed to the utilization of artificial intelligence to transform physiology data to advance health care. In this review, we explore how past work has shaped the field and defined future challenges and directions. In particular, we focus on three areas of development. First, we give an overview of artificial intelligence, with special attention to the most relevant artificial intelligence models. We then detail how physiology data have been harnessed by artificial intelligence to advance the main areas of health care: automating existing health care tasks, increasing access to care, and augmenting health care capabilities. Finally, we discuss emerging concerns surrounding the use of individual physiology data and detail an increasingly important consideration for the field, namely the challenges of deploying artificial intelligence models to achieve meaningful clinical impact.


Subjects
Artificial Intelligence, Delivery of Health Care, Humans
10.
Trends Biochem Sci ; 48(12): 1014-1018, 2023 12.
Article in English | MEDLINE | ID: mdl-37833131

ABSTRACT

Generative artificial intelligence (AI) is a burgeoning field with widespread applications, including in science. Here, we explore two paradigms that provide insight into the capabilities and limitations of Chat Generative Pre-trained Transformer (ChatGPT): its ability to (i) define a core biological concept (the Central Dogma of molecular biology); and (ii) interpret the genetic code.


Subjects
Artificial Intelligence, Genetic Code, Molecular Biology
11.
Annu Rev Pharmacol Toxicol ; 64: 159-170, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-37562495

ABSTRACT

Health digital twins (HDTs) are virtual representations of real individuals that can be used to simulate human physiology, disease, and drug effects. HDTs can be used to improve drug discovery and development by providing a data-driven approach to inform target selection, drug delivery, and design of clinical trials. HDTs also offer new applications into precision therapies and clinical decision making. The deployment of HDTs at scale could bring a precision approach to public health monitoring and intervention. Next steps include challenges such as addressing socioeconomic barriers and ensuring the representativeness of the technology based on the training and validation data sets. Governance and regulation of HDT technology are still in the early stages.


Subjects
Biological Science Disciplines, Humans, Drug Delivery Systems, Drug Discovery, Technology, Delivery of Health Care
12.
Trends Genet ; 40(5): 383-386, 2024 May.
Article in English | MEDLINE | ID: mdl-38637270

ABSTRACT

Artificial intelligence (AI) in omics analysis raises privacy threats to patients. Here, we briefly discuss risk factors to patient privacy in data sharing, model training, and release, as well as methods to safeguard and evaluate patient privacy in AI-driven omics methods.


Subjects
Artificial Intelligence, Genomics, Humans, Genomics/methods, Privacy, Information Dissemination
13.
Trends Genet ; 40(10): 891-908, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39117482

ABSTRACT

Harnessing cutting-edge technologies to enhance crop productivity is a pivotal goal in modern plant breeding. Artificial intelligence (AI) is renowned for its prowess in big data analysis and pattern recognition, and is revolutionizing numerous scientific domains including plant breeding. We explore the wider potential of AI tools in various facets of breeding, including data collection, unlocking genetic diversity within genebanks, and bridging the genotype-phenotype gap to facilitate crop breeding. This will enable the development of crop cultivars tailored to the projected future environments. Moreover, AI tools also hold promise for refining crop traits by improving the precision of gene-editing systems and predicting the potential effects of gene variants on plant phenotypes. Leveraging AI-enabled precision breeding can augment the efficiency of breeding programs and holds promise for optimizing cropping systems at the grassroots level. This entails identifying optimal inter-cropping and crop-rotation models to enhance agricultural sustainability and productivity in the field.


Subjects
Artificial Intelligence, Agricultural Crops, Plant Breeding, Plant Breeding/methods, Agricultural Crops/genetics, Agricultural Crops/growth & development, Phenotype, Genetic Variation, Gene Editing/methods, Genotype
14.
Annu Rev Genomics Hum Genet ; 25(1): 141-159, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38724019

ABSTRACT

Significant progress has been made in augmenting clinical decision-making using artificial intelligence (AI) in the context of secondary and tertiary care at large academic medical centers. For such innovations to have an impact across the spectrum of care, additional challenges must be addressed, including inconsistent use of preventative care and gaps in chronic care management. The integration of additional data, including genomics and data from wearables, could prove critical in addressing these gaps, but technical, legal, and ethical challenges arise. On the technical side, approaches for integrating complex and messy data are needed. Data and design imperfections like selection bias, missing data, and confounding must be addressed. In terms of legal and ethical challenges, while AI has the potential to aid in leveraging patient data to make clinical care decisions, we also risk exacerbating existing disparities. Organizations implementing AI solutions must carefully consider how they can improve care for all and reduce inequities.


Subjects
Artificial Intelligence, Precision Medicine, Humans, Clinical Decision-Making, Genomics/methods
15.
Am J Hum Genet ; 111(9): 1819-1833, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39146935

ABSTRACT

Large language models (LLMs) are generating interest in medical settings. For example, LLMs can respond coherently to medical queries by providing plausible differential diagnoses based on clinical notes. However, there are many questions to explore, such as evaluating differences between open- and closed-source LLMs as well as LLM performance on queries from both medical and non-medical users. In this study, we assessed multiple LLMs, including Llama-2-chat, Vicuna, Medllama2, Bard/Gemini, Claude, ChatGPT3.5, and ChatGPT-4, as well as non-LLM approaches (Google search and Phenomizer) regarding their ability to identify genetic conditions from textbook-like clinician questions and their corresponding layperson translations related to 63 genetic conditions. For open-source LLMs, larger models were more accurate than smaller LLMs: 7b, 13b, and larger than 33b parameter models obtained accuracy ranges from 21%-49%, 41%-51%, and 54%-68%, respectively. Closed-source LLMs outperformed open-source LLMs, with ChatGPT-4 performing best (89%-90%). Three of 11 LLMs and Google search had significant performance gaps between clinician and layperson prompts. We also evaluated how in-context prompting and keyword removal affected open-source LLM performance. Models were provided with 2 types of in-context prompts: list-type prompts, which improved LLM performance, and definition-type prompts, which did not. We further analyzed removal of rare terms from descriptions, which decreased accuracy for 5 of 7 evaluated LLMs. Finally, we observed much lower performance with real individuals' descriptions; LLMs answered these questions with a maximum 21% accuracy.


Subjects
Self Report, Humans, Language, Inborn Genetic Diseases/genetics
16.
Am J Hum Genet ; 111(10): 2190-2202, 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39255797

ABSTRACT

Phenotype-driven gene prioritization is fundamental to diagnosing rare genetic disorders. While traditional approaches rely on curated knowledge graphs with phenotype-gene relations, recent advancements in large language models (LLMs) promise a streamlined text-to-gene solution. In this study, we evaluated five LLMs, including two generative pre-trained transformer (GPT) series models and three Llama2 series models, assessing their performance across task completeness, gene prediction accuracy, and adherence to required output structures. We conducted experiments exploring various combinations of models, prompts, phenotypic input types, and task difficulty levels. Our findings revealed that the best-performing LLM, GPT-4, achieved an average accuracy of 17.0% in identifying diagnosed genes within the top 50 predictions, which still falls behind traditional tools. However, accuracy increased with the model size. Consistent results were observed over time, as shown in the dataset curated after 2023. Advanced techniques such as retrieval-augmented generation (RAG) and few-shot learning did not improve the accuracy. Sophisticated prompts were more likely to enhance task completeness, especially in smaller models. Conversely, complicated prompts tended to decrease output structure compliance rate. LLMs also achieved better-than-random prediction accuracy with free-text input, though performance was slightly lower than with standardized concept input. Bias analysis showed that highly cited genes, such as BRCA1, TP53, and PTEN, are more likely to be predicted. Our study provides valuable insights into integrating LLMs with genomic analysis, contributing to the ongoing discussion on their utilization in clinical workflows.


Subjects
Phenotype, Rare Diseases, Humans, Rare Diseases/genetics, Computational Biology/methods
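
The abstract of record 16 reports accuracy as the share of cases whose diagnosed gene appears within the top 50 predictions. A minimal Python sketch of that kind of top-k metric is given below. The function name `top_k_accuracy`, the case-insensitive matching, and the toy data are all assumptions made for illustration; the study's actual evaluation code and matching rules are not described in this listing.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   diagnosed_genes: Sequence[str],
                   k: int = 50) -> float:
    """Fraction of cases whose diagnosed gene is among the first k predictions."""
    if len(ranked_predictions) != len(diagnosed_genes):
        raise ValueError("one ranked list is required per case")
    if not diagnosed_genes:
        return 0.0
    # Assumption: gene symbols are compared case-insensitively.
    hits = sum(
        truth.upper() in {gene.upper() for gene in ranked[:k]}
        for ranked, truth in zip(ranked_predictions, diagnosed_genes)
    )
    return hits / len(diagnosed_genes)


# Hypothetical toy data: three cases, each with a ranked list of gene symbols.
toy_predictions = [
    ["BRCA1", "TP53", "PTEN"],
    ["TP53", "CFTR", "BRCA1"],
    ["PTEN", "BRCA1", "TP53"],
]
toy_truth = ["TP53", "SMN1", "PTEN"]
toy_score = top_k_accuracy(toy_predictions, toy_truth, k=2)
```

With the toy data, two of the three diagnosed genes fall inside the first two predictions, so `toy_score` is two thirds. Lowering `k` can only keep the score the same or reduce it.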
17.
Proc Natl Acad Sci U S A ; 121(16): e2303165121, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38607932

ABSTRACT

Antimicrobial resistance was estimated to be associated with 4.95 million deaths worldwide in 2019. It is possible to frame the antimicrobial resistance problem as a feedback-control problem. If we could optimize this feedback-control problem and translate our findings to the clinic, we could slow, prevent, or reverse the development of high-level drug resistance. Prior work on this topic has relied on systems where the exact dynamics and parameters were known a priori. In this study, we extend this work using a reinforcement learning (RL) approach capable of learning effective drug cycling policies in a system defined by empirically measured fitness landscapes. Crucially, we show that it is possible to learn effective drug cycling policies despite the problems of noisy, limited, or delayed measurement. Given access to a panel of 15 β-lactam antibiotics with which to treat the simulated Escherichia coli population, we demonstrate that RL agents outperform two naive treatment paradigms at minimizing the population fitness over time. We also show that RL agents approach the performance of the optimal drug cycling policy. Even when stochastic noise is introduced to the measurements of population fitness, we show that RL agents are capable of maintaining evolving populations at lower growth rates compared to controls. We further tested our approach in arbitrary fitness landscapes of up to 1,024 genotypes. We show that minimization of population fitness using drug cycles is not limited by increasing genome size. Our work represents a proof-of-concept for using AI to control complex evolutionary processes.


Subjects
Anti-Infective Agents, Learning, Reinforcement (Psychology), Microbial Drug Resistance, Bicycling, Escherichia coli/genetics
18.
Proc Natl Acad Sci U S A ; 121(41): e2322420121, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39365822

ABSTRACT

The widespread adoption of large language models (LLMs) makes it important to recognize their strengths and limitations. We argue that to develop a holistic understanding of these systems, we must consider the problem that they were trained to solve: next-word prediction over Internet text. By recognizing the pressures that this task exerts, we can make predictions about the strategies that LLMs will adopt, allowing us to reason about when they will succeed or fail. Using this approach, which we call the teleological approach, we identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input. To test our predictions, we evaluate five LLMs (GPT-3.5, GPT-4, Claude 3, Llama 3, and Gemini 1.0) on 11 tasks, and we find robust evidence that LLMs are influenced by probability in the hypothesized ways. Many of the experiments reveal surprising failure modes. For instance, GPT-4's accuracy at decoding a simple cipher is 51% when the output is a high-probability sentence but only 13% when it is low-probability, even though this task is a deterministic one for which probability should not matter. These results show that AI practitioners should be careful about using LLMs in low-probability situations. More broadly, we conclude that we should not evaluate LLMs as if they are humans but should instead treat them as a distinct type of system, one that has been shaped by its own particular set of pressures.


Subjects
Language, Humans, Theoretical Models
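
The abstract of record 18 notes that decoding a simple cipher is a deterministic task for which the probability of the output should not matter. The Python sketch below illustrates that point under a stated assumption: the abstract does not name the cipher, so a rot-13-style letter shift is used here purely as an example, and the function name `shift_decode` is hypothetical.

```python
import string


def shift_decode(ciphertext: str, shift: int = 13) -> str:
    """Undo a letter-shift cipher by moving each letter back `shift` places."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = shift % 26
    # Map every shifted letter back to its plaintext letter;
    # characters outside A-Z and a-z are left unchanged.
    table = str.maketrans(
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
        lower + upper,
    )
    return ciphertext.translate(table)


# The same fixed rule applies whether or not the plaintext is a likely sentence.
decoded = shift_decode("Uryyb, jbeyq!")
```

Because the mapping is a fixed lookup table, a conventional program decodes likely and unlikely sentences equally well, which is the baseline against which the reported 51% and 13% figures stand out.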
19.
Proc Natl Acad Sci U S A ; 121(18): e2307304121, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38640257

ABSTRACT

Over the past few years, machine learning models have significantly increased in size and complexity, especially in the area of generative AI such as large language models. These models require massive amounts of data and compute capacity to train, to the extent that concerns over the training data (such as protected or private content) cannot be practically addressed by retraining the model "from scratch" with the questionable data removed or altered. Furthermore, despite significant efforts and controls dedicated to ensuring that training corpora are properly curated and composed, the sheer volume required makes it infeasible to manually inspect each datum comprising a training corpus. One potential approach to training corpus data defects is model disgorgement, by which we broadly mean the elimination or reduction of not only any improperly used data, but also the effects of improperly used data on any component of an ML model. Model disgorgement techniques can be used to address a wide range of issues, such as reducing bias or toxicity, increasing fidelity, and ensuring responsible use of intellectual property. In this paper, we survey the landscape of model disgorgement methods and introduce a taxonomy of disgorgement techniques that are applicable to modern ML systems. In particular, we investigate the various meanings of "removing the effects" of data on the trained model in a way that does not require retraining from scratch.


Subjects
Language, Machine Learning
20.
Hum Mol Genet ; 33(15): 1367-1377, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38704739

ABSTRACT

Spinal muscular atrophy (SMA) is caused by partial loss of survival of motoneuron (SMN) protein expression. The numerous interaction partners and mechanisms influenced by SMN loss result in a complex disease. Current treatments restore SMN protein levels to a certain extent, but do not cure all symptoms. The prolonged survival of patients creates an increasing need for a better understanding of SMA. Although many SMN-protein interactions, dysregulated pathways, and organ phenotypes are known, the connections among them remain largely unexplored. Monogenic diseases are ideal examples for the exploration of cause-and-effect relationships to create a network describing the disease context. Machine learning tools can utilize such knowledge to analyze similarities between disease-relevant molecules and molecules not described in the disease so far. We used an artificial intelligence-based algorithm to predict new genes of interest. The transcriptional regulation of 8 out of 13 molecules selected from the predicted set was successfully validated in an SMA mouse model. This bioinformatic approach, using the given experimental knowledge for relevance predictions, enhances efficient targeted research in SMA and potentially in other disease settings.


Subjects
Artificial Intelligence, Computational Biology, Animal Disease Models, Spinal Muscular Atrophy, Spinal Muscular Atrophy/genetics, Spinal Muscular Atrophy/metabolism, Animals, Mice, Humans, Computational Biology/methods, Survival of Motor Neuron 1 Protein/genetics, Survival of Motor Neuron 1 Protein/metabolism, Machine Learning, Algorithms, Gene Expression Regulation/genetics