Results 1 - 13 of 13
1.
Cell ; 173(7): 1581-1592, 2018 06 14.
Article in English | MEDLINE | ID: mdl-29887378

ABSTRACT

Machine learning, a collection of data-analytical techniques aimed at building predictive models from multi-dimensional datasets, is becoming integral to modern biological research. By enabling one to generate models that learn from large datasets and make predictions on likely outcomes, machine learning can be used to study complex cellular systems such as biological networks. Here, we provide a primer on machine learning for life scientists, including an introduction to deep learning. We discuss opportunities and challenges at the intersection of machine learning and network biology, which could impact disease biology, drug discovery, microbiome research, and synthetic biology.


Subject(s)
Computational Biology/methods; Machine Learning; Algorithms; Databases, Factual; Drug Discovery; Drug-Related Side Effects and Adverse Reactions; Humans; Microbiota; Neural Networks, Computer
2.
Proc Natl Acad Sci U S A ; 121(24): e2318124121, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38830100

ABSTRACT

There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs; this is insufficient for making an informed decision about which LLMs are best to use in an interactive setting, and how that varies by setting. Static assessment therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analyzing MathConverse, we derive a taxonomy of human query behaviors and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, among other findings. Further, we garner a more granular understanding of GPT-4 mathematical problem-solving through a series of case studies, contributed by experienced mathematicians. We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and can provide a concise rationale for their recommendations, may constitute better assistants. Humans should inspect LLM output carefully given their current shortcomings and potential for surprising fallibility.


Subject(s)
Language; Mathematics; Problem Solving; Humans; Problem Solving/physiology; Students/psychology
3.
Proc Natl Acad Sci U S A ; 118(27), 2021 07 06.
Article in English | MEDLINE | ID: mdl-34187888

ABSTRACT

Recent progress in DNA synthesis and sequencing technology has enabled systematic studies of protein function at a massive scale. We explore a deep mutational scanning study that measured the transcriptional repression function of 43,669 variants of the Escherichia coli LacI protein. We analyze structural and evolutionary aspects that relate to how the function of this protein is maintained, including an in-depth look at the C-terminal domain. We develop a deep neural network to predict transcriptional repression mediated by the lac repressor of Escherichia coli using experimental measurements of variant function. When measured across 10 separate training and validation splits using 5,009 single mutations of the lac repressor, our best-performing model achieved a median Pearson correlation of 0.79, exceeding any previous model. We demonstrate that deep representation learning approaches, first trained in an unsupervised manner across millions of diverse proteins, can be fine-tuned in a supervised fashion using lac repressor experimental datasets to more effectively predict a variant's effect on repression. These findings suggest a deep representation learning model may improve the prediction of other important properties of proteins.
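
The evaluation summarized in this abstract (a median Pearson correlation over 10 training and validation splits) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not describe the splitting scheme, so random 80/20 splits are used, and `fit_line` is a hypothetical one-feature linear stand-in for the authors' deep neural network.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Hypothetical stand-in model: least-squares line y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return lambda x: a * x + b


def median_pearson(xs, ys, fit, n_splits=10, val_frac=0.2, seed=0):
    """Median validation Pearson r over repeated random splits (assumed scheme)."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_val = max(2, int(len(idx) * val_frac))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        val, train = idx[:n_val], idx[n_val:]
        # Train only on the training indices, then score held-out predictions.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in val]
        scores.append(pearson(preds, [ys[i] for i in val]))
    return statistics.median(scores)
```

Reporting the median rather than a single split makes the score less sensitive to one lucky or unlucky partition of the variants.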


Subject(s)
Deep Learning; Escherichia coli Proteins/metabolism; Lac Repressors/metabolism; Transcription, Genetic; Epistasis, Genetic; Escherichia coli Proteins/genetics; Lac Repressors/genetics; Mutation/genetics; Protein Domains; Reproducibility of Results
4.
Fam Community Health ; 45(2): 59-66, 2022.
Article in English | MEDLINE | ID: mdl-35125488

ABSTRACT

Mixed-status families, whose members have multiple immigration statuses, are common in US immigrant communities. Large-scale worksite raids, an immigration enforcement tactic used throughout US history, returned during the Trump administration. Yet, little research characterizes the impacts of these raids, especially as related to mixed-status families. The current study (1) describes a working definition of a large-scale worksite raid and (2) considers impacts of these raids on mixed-status families. We conducted semistructured interviews in Spanish and English at 6 communities that experienced the largest worksite raids in 2018. Participants were 77 adults who provided material, emotional, or professional support following raids. Qualitative analysis methods were used to develop a codebook and code all interviews. The unpredictability of worksite raids resulted in chaos and confusion, often stemming from potential family separation. Financial crises followed because of the removal of primary financial providers. In response, families rearranged roles to generate income. Large-scale worksite raids result in similar harms to mixed-status families as other enforcement tactics but on a much larger scale. They also uniquely drain community resources, with long-term impacts. Advocacy and policy efforts are needed to mitigate damage and end this practice.


Subject(s)
Emigrants and Immigrants; Emigration and Immigration; Adult; Family Relations; Hispanic or Latino; Humans; Workplace
5.
RNA Biol ; 18(sup2): 770-781, 2021 11 12.
Article in English | MEDLINE | ID: mdl-34719327

ABSTRACT

TUT4 and the closely related TUT7 are non-templated poly(U) polymerases required at different stages of development, and their mis-regulation or mutation has been linked to important cancer pathologies. While TUT4(7) interaction with its pre-miRNA targets has been characterized in detail, the molecular bases of the broader target recognition process are unclear. Here, we examine RNA binding by the ZnF domains of the protein. We show that TUT4(7) ZnF2 contains two distinct RNA binding surfaces that are used in the interaction with different RNA nucleobases in different targets, i.e., that this small domain encodes diversity in TUT4(7) selectivity and molecular function. Interestingly and unlike other well-characterized CCHC ZnFs, ZnF2 is not physically coupled to the flanking ZnF3 and acts independently in miRNA recognition, while the remaining CCHC ZnF of TUT4(7), ZnF1, has lost its intrinsic RNA binding capability. Together, our data suggest that the ZnFs of TUT4(7) are independent units for RNA and, possibly, protein-protein interactions that underlie the protein's functional flexibility and are likely to play an important role in building its interaction network.


Subject(s)
DNA-Binding Proteins/metabolism; Epistasis, Genetic; Gene Expression Regulation; MicroRNAs/genetics; RNA-Binding Proteins/metabolism; Zinc Fingers; Base Composition; DNA-Binding Proteins/chemistry; Humans; Magnetic Resonance Spectroscopy; MicroRNAs/chemistry; MicroRNAs/metabolism; Poly U; Protein Interaction Domains and Motifs; RNA-Binding Proteins/chemistry; Structure-Activity Relationship
6.
Nucleic Acids Res ; 45(11): 6761-6774, 2017 Jun 20.
Article in English | MEDLINE | ID: mdl-28379442

ABSTRACT

RBM10 is an RNA-binding protein that plays an essential role in development and is frequently mutated in the context of human disease. RBM10 recognizes a diverse set of RNA motifs in introns and exons and regulates alternative splicing. However, the molecular mechanisms underlying this seemingly relaxed sequence specificity are not understood and functional studies have focused on 3′ intronic sites only. Here, we dissect the RNA code recognized by RBM10 and relate it to the splicing regulatory function of this protein. We show that a two-domain RRM1-ZnF unit recognizes a GGA-centered motif enriched in RBM10 exonic sites with high affinity and specificity and test that the interaction with these exonic sequences promotes exon skipping. Importantly, a second RRM domain (RRM2) of RBM10 recognizes a C-rich sequence, which explains its known interaction with the intronic 3′ site of NUMB exon 9 contributing to regulation of the Notch pathway in cancer. Together, these findings explain RBM10's broad RNA specificity and suggest that RBM10 functions as a splicing regulator using two RNA-binding units with different specificities to promote exon skipping.


Subject(s)
RNA-Binding Proteins/physiology; Autoantigens; Base Sequence; Binding Sites; Exons; HEK293 Cells; Humans; Protein Binding; RNA Splicing; RNA, Messenger/chemistry; RNA, Messenger/metabolism; RNA-Binding Proteins/chemistry; Zinc Fingers
7.
Nucleic Acids Res ; 43(6): e41, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25586222

ABSTRACT

Defining the RNA target selectivity of the proteins regulating mRNA metabolism is a key issue in RNA biology. Here we present a novel use of principal component analysis (PCA) to extract the RNA sequence preference of RNA binding proteins. We show that PCA can be used to compare the changes in the nuclear magnetic resonance (NMR) spectrum of a protein upon binding a set of quasi-degenerate RNAs and define the nucleobase specificity. We couple this application of PCA to an automated NMR spectra recording and processing protocol and obtain an unbiased and high-throughput NMR method for the analysis of nucleobase preference in protein-RNA interactions. We test the method on the RNA binding domains of three important regulators of RNA metabolism.
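
To make the PCA step in this abstract concrete, here is a minimal, standard-library sketch that extracts the first principal component from a small matrix. The layout is an assumption, not something the abstract specifies: each row stands for one RNA variant and each column for the change observed at one NMR peak. Power iteration is used here only to keep the example dependency-free; it is not claimed to be the authors' implementation, and the function name is hypothetical.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (component, scores) for the first PC of a row-major matrix.

    rows: list of equal-length lists (assumed: one row per RNA variant,
    one column per NMR peak perturbation). Requires at least two rows.
    """
    n, d = len(rows), len(rows[0])
    # Center each column so the PC describes variation around the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to the eigenvector with the largest eigenvalue.
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    # Project every centered row onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores
```

In this reading, rows with similar scores along the first component would correspond to RNAs that perturb the spectrum in a similar way, which is the kind of comparison the abstract describes.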


Subject(s)
High-Throughput Screening Assays/methods; Nuclear Magnetic Resonance, Biomolecular/methods; RNA-Binding Proteins/metabolism; RNA/genetics; RNA/metabolism; Base Sequence; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/metabolism; High-Throughput Screening Assays/statistics & numerical data; Humans; Models, Molecular; Principal Component Analysis; Protein Interaction Domains and Motifs; RNA-Binding Proteins/chemistry; Recombinant Proteins/chemistry; Recombinant Proteins/metabolism; Saccharomyces cerevisiae Proteins/chemistry; Saccharomyces cerevisiae Proteins/metabolism; mRNA Cleavage and Polyadenylation Factors/chemistry; mRNA Cleavage and Polyadenylation Factors/metabolism
8.
Cell Rep Methods ; 3(6): 100508, 2023 06 26.
Article in English | MEDLINE | ID: mdl-37426752

ABSTRACT

Understanding how the RNA-binding domains of a protein regulator are used to recognize its RNA targets is a key problem in RNA biology, but RNA-binding domains with very low affinity do not perform well in the methods currently available to characterize protein-RNA interactions. Here, we propose to use conservative mutations that enhance the affinity of RNA-binding domains to overcome this limitation. As a proof of principle, we have designed and validated an affinity-enhanced K-homology (KH) domain mutant of the fragile X syndrome protein FMRP, a key regulator of neuronal development, and used this mutant to determine the domain's sequence preference and to explain FMRP recognition of specific RNA motifs in the cell. Our results validate our concept and our nuclear magnetic resonance (NMR)-based workflow. While effective mutant design requires an understanding of the underlying principles of RNA recognition by the relevant domain type, we expect the method will be used effectively in many RNA-binding domains.


Subject(s)
Fragile X Mental Retardation Protein; RNA; RNA/genetics; Fragile X Mental Retardation Protein/genetics; Proteins/genetics; Mutation; RNA-Binding Motifs/genetics
9.
Microorganisms ; 11(4), 2023 Apr 20.
Article in English | MEDLINE | ID: mdl-37110501

ABSTRACT

Bacteria use an array of sigma factors to regulate gene expression during different stages of their life cycles. Full-length, atomic-level structures of sigma factors have been challenging to obtain experimentally as a result of their many regions of intrinsic disorder. AlphaFold has now supplied plausible full-length models for most sigma factors. Here we discuss the current understanding of the structures and functions of sigma factors in the model organism, Bacillus subtilis, and present an X-ray crystal structure of a region of B. subtilis SigE, a sigma factor that plays a critical role in the developmental process of spore formation.

10.
Healthcare (Basel) ; 11(14), 2023 Jul 13.
Article in English | MEDLINE | ID: mdl-37510458

ABSTRACT

BACKGROUND: The prevalence of Alzheimer's disease (AD) is projected to increase as the population ages, and current treatments are minimally effective. Transcranial photobiomodulation (t-PBM) with near-infrared (NIR) light penetrates into the cerebral cortex, stimulates the mitochondrial respiratory chain, and increases cerebral blood flow. Preliminary data suggest t-PBM may be efficacious in improving cognition in people with early AD and amnestic mild cognitive impairment (aMCI). METHODS: In this randomized, double-blind, placebo-controlled study with aMCI and early AD participants, we will test the efficacy, safety, and impact on cognition of 24 sessions of t-PBM delivered over 8 weeks. Brain mechanisms of t-PBM in this population will be explored by testing whether the baseline tau burden (measured with 18F-MK6240), or changes in mitochondrial function over 8 weeks (assessed with 31P-MRSI), moderates the changes observed in cognitive functions after t-PBM therapy. We will also use changes in the fMRI Blood-Oxygenation-Level-Dependent (BOLD) signal after a single treatment to demonstrate t-PBM-dependent increases in prefrontal cortex blood flow. CONCLUSION: This study will test whether t-PBM, a low-cost, accessible, and user-friendly intervention, has the potential to improve cognition and function in an aMCI and early AD population.

11.
Cell Syst ; 14(6): 525-542.e9, 2023 06 21.
Article in English | MEDLINE | ID: mdl-37348466

ABSTRACT

The design choices underlying machine-learning (ML) models present important barriers to entry for many biologists who aim to incorporate ML in their research. Automated machine-learning (AutoML) algorithms can address many challenges that come with applying ML to the life sciences. However, these algorithms are rarely used in systems and synthetic biology studies because they typically do not explicitly handle biological sequences (e.g., nucleotide, amino acid, or glycan sequences) and cannot be easily compared with other AutoML algorithms. Here, we present BioAutoMATED, an AutoML platform for biological sequence analysis that integrates multiple AutoML methods into a unified framework. Users are automatically provided with relevant techniques for analyzing, interpreting, and designing biological sequences. BioAutoMATED predicts gene regulation, peptide-drug interactions, and glycan annotation, and designs optimized synthetic biology components, revealing salient sequence characteristics. By automating sequence modeling, BioAutoMATED allows life scientists to incorporate ML more readily into their work.


Subject(s)
Algorithms; Machine Learning
12.
Nat Commun ; 11(1): 5058, 2020 10 07.
Article in English | MEDLINE | ID: mdl-33028819

ABSTRACT

While synthetic biology has revolutionized our approaches to medicine, agriculture, and energy, the design of completely novel biological circuit components beyond naturally-derived templates remains challenging due to poorly understood design rules. Toehold switches, which are programmable nucleic acid sensors, face an analogous design bottleneck; our limited understanding of how sequence impacts functionality often necessitates expensive, time-consuming screens to identify effective switches. Here, we introduce Sequence-based Toehold Optimization and Redesign Model (STORM) and Nucleic-Acid Speech (NuSpeak), two orthogonal and synergistic deep learning architectures to characterize and optimize toeholds. Applying techniques from computer vision and natural language processing, we 'un-box' our models using convolutional filters, attention maps, and in silico mutagenesis. Through transfer-learning, we redesign sub-optimal toehold sensors, even with sparse training data, experimentally validating their improved performance. This work provides sequence-to-function deep learning frameworks for toehold selection and design, augmenting our ability to construct potent biological circuit components and precision diagnostics.
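
The in silico mutagenesis mentioned in this abstract can be sketched in a few lines: substitute every position of a sequence with every alternative base and record how a model's score changes. The names below are hypothetical, the four-letter RNA alphabet is an assumption, and `gc_fraction` is only a toy stand-in for a trained model such as the ones the abstract describes.

```python
NUCLEOTIDES = "ACGU"  # assumed alphabet for this illustration


def in_silico_mutagenesis(seq, score_fn):
    """Map (position, substituted base) to the score change versus seq."""
    baseline = score_fn(seq)
    effects = {}
    for pos, original in enumerate(seq):
        for nt in NUCLEOTIDES:
            if nt == original:
                continue  # skip the unmutated base
            mutant = seq[:pos] + nt + seq[pos + 1:]
            effects[(pos, nt)] = score_fn(mutant) - baseline
    return effects


def gc_fraction(seq):
    """Toy stand-in for a trained model: fraction of G or C bases."""
    return sum(base in "GC" for base in seq) / len(seq)


# Example: score changes for every single substitution in a short sequence.
effects = in_silico_mutagenesis("AUGC", gc_fraction)
```

Positions where substitutions produce large score changes are the ones such a model treats as important, which is how this kind of scan helps interpret an otherwise opaque predictor.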


Subject(s)
Biotechnology/methods; Deep Learning; Genetic Engineering/methods; Riboswitch/genetics; Synthetic Biology/methods; Base Sequence/genetics; Computer Simulation; Datasets as Topic; Genome, Human/genetics; Genome, Viral/genetics; Humans; Models, Genetic; Mutagenesis; Natural Language Processing; Structure-Activity Relationship
13.
Structure ; 26(4): 640-648.e5, 2018 04 03.
Article in English | MEDLINE | ID: mdl-29526435

ABSTRACT

Global changes in bacterial gene expression can be orchestrated by the coordinated activation/deactivation of alternative sigma (σ) factor subunits of RNA polymerase. Sigma factors themselves are regulated in myriad ways, including via anti-sigma factors. Here, we have determined the solution structure of anti-sigma factor CsfB, responsible for inhibition of two alternative sigma factors, σG and σE, during spore formation by Bacillus subtilis. CsfB assembles into a symmetrical homodimer, with each monomer bound to a single Zn2+ ion via a treble-clef zinc finger fold. Directed mutagenesis indicates that dimer formation is critical for CsfB-mediated inhibition of both σG and σE, and we have characterized these interactions in vitro. This work represents an advance in our understanding of how CsfB mediates inhibition of two alternative sigma factors to drive developmental gene expression in a bacterium.


Subject(s)
Bacillus subtilis/chemistry; Gene Expression Regulation, Bacterial; Repressor Proteins/chemistry; Sigma Factor/chemistry; Spores, Bacterial/chemistry; Zinc/chemistry; Amino Acid Sequence; Bacillus subtilis/genetics; Bacillus subtilis/metabolism; Binding Sites; Cations, Divalent; Cloning, Molecular; Crystallography, X-Ray; Escherichia coli/genetics; Escherichia coli/metabolism; Genetic Vectors/chemistry; Genetic Vectors/metabolism; Models, Molecular; Mutation; Protein Binding; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Interaction Domains and Motifs; Protein Isoforms/antagonists & inhibitors; Protein Isoforms/chemistry; Protein Isoforms/genetics; Protein Isoforms/metabolism; Protein Multimerization; Recombinant Proteins/chemistry; Recombinant Proteins/genetics; Recombinant Proteins/metabolism; Repressor Proteins/genetics; Repressor Proteins/metabolism; Sequence Alignment; Sequence Homology, Amino Acid; Sigma Factor/antagonists & inhibitors; Sigma Factor/genetics; Sigma Factor/metabolism; Spores, Bacterial/genetics; Spores, Bacterial/metabolism; Zinc/metabolism