Results 1 - 20 of 21,593
1.
Cell ; 187(3): 526-544, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38306980

ABSTRACT

Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles (tunability, controllability, and modularity) into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.


Subject(s)
Artificial Intelligence, Proteins, Protein Conformation, Proteins/chemistry, Proteins/metabolism, Protein Engineering, Deep Learning
2.
Cell ; 187(10): 2502-2520.e17, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38729110

ABSTRACT

Human tissue, which is inherently three-dimensional (3D), is traditionally examined through standard-of-care histopathology as limited two-dimensional (2D) cross-sections that can insufficiently represent the tissue due to sampling bias. To holistically characterize histomorphology, 3D imaging modalities have been developed, but clinical translation is hampered by complex manual evaluation and lack of computational platforms to distill clinical insights from large, high-resolution datasets. We present TriPath, a deep-learning platform for processing tissue volumes and efficiently predicting clinical outcomes based on 3D morphological features. Recurrence risk-stratification models were trained on prostate cancer specimens imaged with open-top light-sheet microscopy or microcomputed tomography. By comprehensively capturing 3D morphologies, 3D volume-based prognostication achieves superior performance to traditional 2D slice-based approaches, including clinical/histopathological baselines from six certified genitourinary pathologists. Incorporating greater tissue volume improves prognostic performance and mitigates risk prediction variability from sampling bias, further emphasizing the value of capturing larger extents of heterogeneous morphology.


Subject(s)
Three-Dimensional Imaging, Prostatic Neoplasms, Supervised Machine Learning, Humans, Male, Deep Learning, Three-Dimensional Imaging/methods, Prognosis, Prostatic Neoplasms/pathology, Prostatic Neoplasms/diagnostic imaging, X-Ray Microtomography/methods
3.
Cell ; 184(19): 5053-5069.e23, 2021 Sep 16.
Article in English | MEDLINE | ID: mdl-34390642

ABSTRACT

Genetic perturbations of cortical development can lead to neurodevelopmental disease, including autism spectrum disorder (ASD). To identify genomic regions crucial to corticogenesis, we mapped the activity of gene-regulatory elements, generating a single-cell atlas of gene expression and chromatin accessibility both independently and jointly. This revealed waves of gene regulation by key transcription factors (TFs) across a nearly continuous differentiation trajectory, distinguished the expression programs of glial lineages, and identified lineage-determining TFs that exhibited strong correlation between linked gene-regulatory elements and expression levels. These highly connected genes adopted an active chromatin state in early differentiating cells, consistent with lineage commitment. Base-pair-resolution neural network models identified strong cell-type-specific enrichment of noncoding mutations predicted to be disruptive in a cohort of ASD individuals and identified frequently disrupted TF binding sites. This approach illustrates how cell-type-specific mapping can provide insights into the programs governing human development and disease.


Subject(s)
Cerebral Cortex/embryology, Chromatin/metabolism, Developmental Gene Expression Regulation, Single-Cell Analysis, Astrocytes/cytology, Cell Differentiation, Cell Lineage/genetics, Cluster Analysis, Deep Learning, Genetic Epigenesis, Fuzzy Logic, Glutamates/metabolism, Humans, Mutation/genetics, Neurons/metabolism, Nucleic Acid Regulatory Sequences/genetics
4.
Cell ; 184(7): 1865-1883.e20, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33636127

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the ongoing coronavirus disease 2019 (COVID-19) pandemic. Understanding of the RNA virus and its interactions with host proteins could improve therapeutic interventions for COVID-19. By using icSHAPE, we determined the structural landscape of SARS-CoV-2 RNA in infected human cells and from refolded RNAs, as well as the regulatory untranslated regions of SARS-CoV-2 and six other coronaviruses. We validated several structural elements predicted in silico and discovered structural features that affect the translation and abundance of subgenomic viral RNAs in cells. The structural data informed a deep-learning tool to predict 42 host proteins that bind to SARS-CoV-2 RNA. Strikingly, antisense oligonucleotides targeting the structural elements and FDA-approved drugs inhibiting the SARS-CoV-2 RNA binding proteins dramatically reduced SARS-CoV-2 infection in cells derived from human liver and lung tumors. Our findings thus shed light on coronavirus and reveal multiple candidate therapeutics for COVID-19 treatment.


Subject(s)
COVID-19 Drug Treatment, Drug Repositioning, Viral RNA, RNA-Binding Proteins/antagonists & inhibitors, SARS-CoV-2, Animals, Cell Line, Chlorocebus aethiops, Deep Learning, Humans, Nucleic Acid Conformation, Viral RNA/chemistry, RNA-Binding Proteins/metabolism, SARS-CoV-2/drug effects, SARS-CoV-2/genetics
5.
Cell ; 180(4): 796-812.e19, 2020 Feb 20.
Article in English | MEDLINE | ID: mdl-32059778

ABSTRACT

Optical tissue transparency permits scalable cellular and molecular investigation of complex tissues in 3D. Adult human organs are particularly challenging to render transparent because of the accumulation of dense and sturdy molecules in decades-aged tissues. To overcome these challenges, we developed SHANEL, a method based on a new tissue permeabilization approach to clear and label stiff human organs. We used SHANEL to render the intact adult human brain and kidney transparent and perform 3D histology with antibodies and dyes at centimeter depth. Thereby, we revealed structural details of the intact human eye, human thyroid, human kidney, and transgenic pig pancreas at cellular resolution. Furthermore, we developed a deep learning pipeline to analyze millions of cells in cleared human brain tissues within hours with standard lab computers. Overall, SHANEL is a robust and unbiased technology to chart the cellular and molecular architecture of large intact mammalian organs.


Subject(s)
Deep Learning, Three-Dimensional Imaging/methods, Optical Imaging/methods, Staining and Labeling/methods, Aged 80 and Over, Animals, Brain/diagnostic imaging, Eye/diagnostic imaging, Female, Humans, Three-Dimensional Imaging/standards, Kidney/diagnostic imaging, Limit of Detection, Male, Mice, Middle Aged, Optical Imaging/standards, Pancreas/diagnostic imaging, Staining and Labeling/standards, Swine, Thyroid Gland/diagnostic imaging
6.
Nat Rev Mol Cell Biol ; 23(1): 40-55, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34518686

ABSTRACT

The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.


Subject(s)
Biology, Machine Learning, Animals, Deep Learning, Humans, Computer Neural Networks
7.
Cell ; 176(3): 414-416, 2019 Jan 24.
Article in English | MEDLINE | ID: mdl-30682368

ABSTRACT

The importance of genomic sequence context in generating transcriptome diversity through RNA splicing is independently unmasked by two studies in this issue (Jaganathan et al., 2019; Baeza-Centurion et al., 2019).


Subject(s)
Deep Learning, RNA Splicing, Genome, Genomics, Transcriptome
8.
Cell ; 178(1): 91-106.e23, 2019 Jun 27.
Article in English | MEDLINE | ID: mdl-31178116

ABSTRACT

Alternative polyadenylation (APA) is a major driver of transcriptome diversity in human cells. Here, we use deep learning to predict APA from DNA sequence alone. We trained our model (APARENT, APA REgression NeT) on isoform expression data from over 3 million APA reporters. APARENT's predictions are highly accurate when tasked with inferring APA in synthetic and human 3'UTRs. Visualizing features learned across all network layers reveals that APARENT recognizes sequence motifs known to recruit APA regulators, discovers previously unknown sequence determinants of 3' end processing, and integrates these features into a comprehensive, interpretable, cis-regulatory code. We apply APARENT to forward engineer functional polyadenylation signals with precisely defined cleavage position and isoform usage and validate predictions experimentally. Finally, we use APARENT to quantify the impact of genetic variants on APA. Our approach detects pathogenic variants in a wide range of disease contexts, expanding our understanding of the genetic origins of disease.


Subject(s)
Deep Learning, Genetic Models, Polyadenylation/genetics, 3' Untranslated Regions/genetics, Base Sequence/genetics, Genetic Databases, Gene Expression/genetics, HEK293 Cells, Humans, Mutagenesis/genetics, RNA Cleavage/genetics, Messenger RNA/genetics, RNA-Seq, Synthetic Biology, Transcriptome
9.
Cell ; 176(3): 535-548.e24, 2019 Jan 24.
Article in English | MEDLINE | ID: mdl-30661751

ABSTRACT

The splicing of pre-mRNAs into mature transcripts is remarkable for its precision, but the mechanisms by which the cellular machinery achieves such specificity are incompletely understood. Here, we describe a deep neural network that accurately predicts splice junctions from an arbitrary pre-mRNA transcript sequence, enabling precise prediction of noncoding genetic variants that cause cryptic splicing. Synonymous and intronic mutations with predicted splice-altering consequence validate at a high rate on RNA-seq and are strongly deleterious in the human population. De novo mutations with predicted splice-altering consequence are significantly enriched in patients with autism and intellectual disability compared to healthy controls and validate against RNA-seq in 21 out of 28 of these patients. We estimate that 9%-11% of pathogenic mutations in patients with rare genetic disorders are caused by this previously underappreciated class of disease variation.


Subject(s)
Forecasting/methods, RNA Precursors/genetics, RNA Splicing/genetics, Algorithms, Alternative Splicing/genetics, Autistic Disorder/genetics, Deep Learning, Exons/genetics, Humans, Intellectual Disability/genetics, Introns/genetics, Computer Neural Networks, RNA Precursors/metabolism, RNA Splice Sites/genetics, RNA Splice Sites/physiology
10.
Cell ; 179(7): 1661-1676.e19, 2019 Dec 12.
Article in English | MEDLINE | ID: mdl-31835038

ABSTRACT

Reliable detection of disseminated tumor cells and of the biodistribution of tumor-targeting therapeutic antibodies within the entire body has long been needed to better understand and treat cancer metastasis. Here, we developed an integrated pipeline for automated quantification of cancer metastases and therapeutic antibody targeting, named DeepMACT. First, we enhanced the fluorescent signal of cancer cells more than 100-fold by applying the vDISCO method to image metastasis in transparent mice. Second, we developed deep learning algorithms for automated quantification of metastases with an accuracy matching human expert manual annotation. Deep learning-based quantification in 5 different metastatic cancer models including breast, lung, and pancreatic cancer with distinct organotropisms allowed us to systematically analyze features such as size, shape, spatial distribution, and the degree to which metastases are targeted by a therapeutic monoclonal antibody in entire mice. DeepMACT can thus considerably improve the discovery of effective antibody-based therapeutics at the pre-clinical stage.


Subject(s)
Antibodies/therapeutic use, Deep Learning, Computer-Assisted Diagnosis/methods, Computer-Assisted Drug Therapy/methods, Neoplasms/pathology, Animals, Humans, MCF-7 Cells, Mice, Inbred C57BL Mice, Nude Mice, SCID Mice, Neoplasm Metastasis, Neoplasms/diagnostic imaging, Neoplasms/drug therapy, Software, Tumor Microenvironment
11.
Cell ; 172(5): 893-895, 2018 Feb 22.
Article in English | MEDLINE | ID: mdl-29474917

ABSTRACT

Kermany et al. report an application of a neural network trained on millions of everyday images to a database of thousands of retinal tomography images that they gathered and expert labeled, resulting in a rapid and accurate diagnosis of retinal diseases.


Subject(s)
Deep Learning, Retinal Diseases, Humans, Computer Neural Networks
12.
Cell ; 175(1): 266-276.e13, 2018 Sep 20.
Article in English | MEDLINE | ID: mdl-30166209

ABSTRACT

A fundamental challenge of biology is to understand the vast heterogeneity of cells, particularly how cellular composition, structure, and morphology are linked to cellular physiology. Unfortunately, conventional technologies are limited in uncovering these relations. We present a machine-intelligence technology based on a radically different architecture that realizes real-time image-based intelligent cell sorting at an unprecedented rate. This technology, which we refer to as intelligent image-activated cell sorting, integrates high-throughput cell microscopy, focusing, and sorting on a hybrid software-hardware data-management infrastructure, enabling real-time automated operation for data acquisition, data processing, decision-making, and actuation. We use it to demonstrate real-time sorting of microalgal and blood cells based on intracellular protein localization and cell-cell interaction from large heterogeneous populations for studying photosynthesis and atherothrombosis, respectively. The technology is highly versatile and expected to enable machine-based scientific discovery in biological, pharmaceutical, and medical sciences.


Subject(s)
Flow Cytometry/methods, High-Throughput Screening Assays/methods, Computer-Assisted Image Processing/methods, Animals, Deep Learning, Humans
13.
Cell ; 172(5): 1122-1131.e9, 2018 Feb 22.
Article in English | MEDLINE | ID: mdl-29474911

ABSTRACT

The implementation of clinical-decision support algorithms for medical imaging faces challenges with reliability and interpretability. Here, we establish a diagnostic tool based on a deep-learning framework for the screening of patients with common treatable blinding retinal diseases. Our framework utilizes transfer learning, which trains a neural network with a fraction of the data of conventional approaches. Applying this approach to a dataset of optical coherence tomography images, we demonstrate performance comparable to that of human experts in classifying age-related macular degeneration and diabetic macular edema. We also provide a more transparent and interpretable diagnosis by highlighting the regions recognized by the neural network. We further demonstrate the general applicability of our AI system for diagnosis of pediatric pneumonia using chest X-ray images. This tool may ultimately aid in expediting the diagnosis and referral of these treatable conditions, thereby facilitating earlier treatment, resulting in improved clinical outcomes.


Subject(s)
Deep Learning, Diagnostic Imaging, Pneumonia/diagnosis, Child, Humans, Computer Neural Networks, Pneumonia/diagnostic imaging, ROC Curve, Reproducibility of Results, Optical Coherence Tomography
14.
Immunity ; 56(7): 1681-1698.e13, 2023 Jul 11.
Article in English | MEDLINE | ID: mdl-37301199

ABSTRACT

CD4+ T cell responses are exquisitely antigen specific and directed toward peptide epitopes displayed by human leukocyte antigen class II (HLA-II) on antigen-presenting cells. Underrepresentation of diverse alleles in ligand databases and an incomplete understanding of factors affecting antigen presentation in vivo have limited progress in defining principles of peptide immunogenicity. Here, we employed monoallelic immunopeptidomics to identify 358,024 HLA-II binders, with a particular focus on HLA-DQ and HLA-DP. We uncovered peptide-binding patterns across a spectrum of binding affinities and enrichment of structural antigen features. These aspects underpinned the development of the context-aware predictor of T cell antigens (CAPTAn), a deep learning model that predicts peptide antigens based on their affinity to HLA-II and the full sequence of their source proteins. CAPTAn was instrumental in discovering prevalent T cell epitopes from bacteria in the human microbiome and a pan-variant epitope from SARS-CoV-2. Together, CAPTAn and associated datasets present a resource for antigen discovery and for unraveling genetic associations of HLA alleles with immunopathologies.


Subject(s)
COVID-19, Deep Learning, Humans, Captan, SARS-CoV-2, HLA Antigens, T-Lymphocyte Epitopes, Peptides
16.
Mol Cell ; 83(14): 2595-2611.e11, 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37421941

ABSTRACT

RNA-binding proteins (RBPs) control RNA metabolism to orchestrate gene expression and, when dysfunctional, underlie human diseases. Proteome-wide discovery efforts predict thousands of RBP candidates, many of which lack canonical RNA-binding domains (RBDs). Here, we present a hybrid ensemble RBP classifier (HydRA), which leverages information from both intermolecular protein interactions and internal protein sequence patterns to predict RNA-binding capacity with unparalleled specificity and sensitivity using support vector machines (SVMs), convolutional neural networks (CNNs), and Transformer-based protein language models. Occlusion mapping by HydRA robustly detects known RBDs and predicts hundreds of uncharacterized RNA-binding associated domains. Enhanced CLIP (eCLIP) for HydRA-predicted RBP candidates reveals transcriptome-wide RNA targets and confirms RNA-binding activity for HydRA-predicted RNA-binding associated domains. HydRA accelerates construction of a comprehensive RBP catalog and expands the diversity of RNA-binding associated domains.


Subject(s)
Deep Learning, Hydra, Animals, Humans, RNA/metabolism, Protein Binding, Binding Sites/genetics, Hydra/genetics, Hydra/metabolism
17.
Nat Rev Genet ; 25(1): 61-78, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37666948

ABSTRACT

In population genetics, the emergence of large-scale genomic data for various species and populations has provided new opportunities to understand the evolutionary forces that drive genetic diversity using statistical inference. However, the era of population genomics presents new challenges in analysing the massive amounts of genomes and variants. Deep learning has demonstrated state-of-the-art performance for numerous applications involving large-scale data. Recently, deep learning approaches have gained popularity in population genetics; facilitated by the advent of massive genomic data sets, powerful computational hardware and complex deep learning architectures, they have been used to identify population structure, infer demographic history and investigate natural selection. Here, we introduce common deep learning architectures and provide comprehensive guidelines for implementing deep learning models for population genetic inference. We also discuss current challenges and future directions for applying deep learning in population genetics, focusing on efficiency, robustness and interpretability.


Subject(s)
Deep Learning, Genomics, Population Genetics, Genome, Biological Evolution
18.
Nature ; 632(8025): 594-602, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38862024

ABSTRACT

Animals have exquisite control of their bodies, allowing them to perform a diverse range of behaviours. How such control is implemented by the brain, however, remains unclear. Advancing our understanding requires models that can relate principles of control to the structure of neural activity in behaving animals. Here, to facilitate this, we built a 'virtual rodent', in which an artificial neural network actuates a biomechanically realistic model of the rat in a physics simulator. We used deep reinforcement learning to train the virtual agent to imitate the behaviour of freely moving rats, thus allowing us to compare neural activity recorded in real rats to the network activity of a virtual rodent mimicking their behaviour. We found that neural activity in the sensorimotor striatum and motor cortex was better predicted by the virtual rodent's network activity than by any features of the real rat's movements, consistent with both regions implementing inverse dynamics. Furthermore, the network's latent variability predicted the structure of neural variability across behaviours and afforded robustness in a way consistent with the minimal intervention principle of optimal feedback control. These results demonstrate how physical simulation of biomechanically realistic virtual animals can help interpret the structure of neural activity across behaviour and relate it to theoretical principles of motor control.


Subject(s)
Animal Behavior, Neurological Models, Computer Neural Networks, Virtual Reality, Animals, Rats, Animal Behavior/physiology, Deep Learning, Motor Cortex/physiology, Movement/physiology, Sensorimotor Cortex/physiology, Female, Long-Evans Rats
19.
Nature ; 630(8016): 493-500, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38718835

ABSTRACT

The introduction of AlphaFold 2 has spurred a revolution in modelling the structure of proteins and their interactions, enabling a huge range of applications in protein modelling and design. Here we describe our AlphaFold 3 model with a substantially updated diffusion-based architecture that is capable of predicting the joint structure of complexes including proteins, nucleic acids, small molecules, ions and modified residues. The new AlphaFold model demonstrates substantially improved accuracy over many previous specialized tools: far greater accuracy for protein-ligand interactions compared with state-of-the-art docking tools, much higher accuracy for protein-nucleic acid interactions compared with nucleic-acid-specific predictors and substantially higher antibody-antigen prediction accuracy compared with AlphaFold-Multimer v.2.3. Together, these results show that high-accuracy modelling across biomolecular space is possible within a single unified deep-learning framework.


Subject(s)
Deep Learning, Ligands, Molecular Models, Proteins, Software, Humans, Antibodies/chemistry, Antibodies/metabolism, Antigens/metabolism, Antigens/chemistry, Deep Learning/standards, Ions/chemistry, Ions/metabolism, Molecular Docking Simulation, Nucleic Acids/chemistry, Nucleic Acids/metabolism, Protein Binding, Protein Conformation, Proteins/chemistry, Proteins/metabolism, Reproducibility of Results, Software/standards
20.
Nature ; 626(7997): 207-211, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38086418

ABSTRACT

Enhancers control gene expression and have crucial roles in development and homeostasis. However, the targeted de novo design of enhancers with tissue-specific activities has remained challenging. Here we combine deep learning and transfer learning to design tissue-specific enhancers for five tissues in the Drosophila melanogaster embryo: the central nervous system, epidermis, gut, muscle and brain. We first train convolutional neural networks using genome-wide single-cell assay for transposase-accessible chromatin with sequencing (ATAC-seq) datasets and then fine-tune the convolutional neural networks with smaller-scale data from in vivo enhancer activity assays, yielding models with 13% to 76% positive predictive value according to cross-validation. We designed and experimentally assessed 40 synthetic enhancers (8 per tissue) in vivo, of which 31 (78%) were active and 27 (68%) functioned in the target tissue (100% for central nervous system and muscle). The strategy of combining genome-wide and small-scale functional datasets by transfer learning is generally applicable and should enable the design of tissue-, cell type- and cell state-specific enhancers in any system.


Subject(s)
Deep Learning, Drosophila melanogaster, Nonmammalian Embryo, Genetic Enhancer Elements, Computer Neural Networks, Organ Specificity, Animals, Chromatin/genetics, Chromatin/metabolism, Datasets as Topic, Drosophila melanogaster/embryology, Drosophila melanogaster/genetics, Nonmammalian Embryo/embryology, Nonmammalian Embryo/metabolism, Genetic Enhancer Elements/genetics, Organ Specificity/genetics, Reproducibility of Results, Single-Cell Analysis, Transposases/metabolism, Synthetic Biology/methods