Results 1 - 13 of 13
1.
Cell ; 173(7): 1581-1592, 2018 06 14.
Article in English | MEDLINE | ID: mdl-29887378

ABSTRACT

Machine learning, a collection of data-analytical techniques aimed at building predictive models from multi-dimensional datasets, is becoming integral to modern biological research. By enabling one to generate models that learn from large datasets and make predictions on likely outcomes, machine learning can be used to study complex cellular systems such as biological networks. Here, we provide a primer on machine learning for life scientists, including an introduction to deep learning. We discuss opportunities and challenges at the intersection of machine learning and network biology, which could impact disease biology, drug discovery, microbiome research, and synthetic biology.


Subjects
Computational Biology/methods , Machine Learning , Algorithms , Databases, Factual , Drug Discovery , Drug-Related Side Effects and Adverse Reactions , Humans , Microbiota , Neural Networks, Computer
2.
Proc Natl Acad Sci U S A ; 121(24): e2318124121, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38830100

ABSTRACT

There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs; this is insufficient for making an informed decision about which LLMs are best to use in an interactive setting, and how that varies by setting. Static assessment therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analyzing MathConverse, we derive a taxonomy of human query behaviors and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, among other findings. Further, we garner a more granular understanding of GPT-4's mathematical problem-solving through a series of case studies, contributed by experienced mathematicians. We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and can provide a concise rationale for their recommendations may constitute better assistants. Humans should inspect LLM output carefully given the models' current shortcomings and potential for surprising fallibility.


Subjects
Language , Mathematics , Problem Solving , Humans , Problem Solving/physiology , Students/psychology
3.
Proc Natl Acad Sci U S A ; 118(27), 2021 07 06.
Article in English | MEDLINE | ID: mdl-34187888

ABSTRACT

Recent progress in DNA synthesis and sequencing technology has enabled systematic studies of protein function at a massive scale. We explore a deep mutational scanning study that measured the transcriptional repression function of 43,669 variants of the Escherichia coli LacI protein. We analyze structural and evolutionary aspects that relate to how the function of this protein is maintained, including an in-depth look at the C-terminal domain. We develop a deep neural network to predict transcriptional repression mediated by the lac repressor of Escherichia coli using experimental measurements of variant function. When measured across 10 separate training and validation splits using 5,009 single mutations of the lac repressor, our best-performing model achieved a median Pearson correlation of 0.79, exceeding any previous model. We demonstrate that deep representation learning approaches, first trained in an unsupervised manner across millions of diverse proteins, can be fine-tuned in a supervised fashion using lac repressor experimental datasets to more effectively predict a variant's effect on repression. These findings suggest a deep representation learning model may improve the prediction of other important properties of proteins.


Subjects
Deep Learning , Escherichia coli Proteins/metabolism , Lac Repressors/metabolism , Transcription, Genetic , Epistasis, Genetic , Escherichia coli Proteins/genetics , Lac Repressors/genetics , Mutation/genetics , Protein Domains , Reproducibility of Results
4.
Fam Community Health ; 45(2): 59-66, 2022.
Article in English | MEDLINE | ID: mdl-35125488

ABSTRACT

Mixed-status families, whose members have multiple immigration statuses, are common in US immigrant communities. Large-scale worksite raids, an immigration enforcement tactic used throughout US history, returned during the Trump administration. Yet, little research characterizes the impacts of these raids, especially as related to mixed-status families. The current study (1) describes a working definition of a large-scale worksite raid and (2) considers impacts of these raids on mixed-status families. We conducted semistructured interviews in Spanish and English at 6 communities that experienced the largest worksite raids in 2018. Participants were 77 adults who provided material, emotional, or professional support following raids. Qualitative analysis methods were used to develop a codebook and code all interviews. The unpredictability of worksite raids resulted in chaos and confusion, often stemming from potential family separation. Financial crises followed because of the removal of primary financial providers. In response, families rearranged roles to generate income. Large-scale worksite raids result in similar harms to mixed-status families as other enforcement tactics but on a much larger scale. They also uniquely drain community resources, with long-term impacts. Advocacy and policy efforts are needed to mitigate damage and end this practice.


Subjects
Emigrants and Immigrants , Emigration and Immigration , Adult , Family Relations , Hispanic or Latino , Humans , Workplace
5.
RNA Biol ; 18(sup2): 770-781, 2021 11 12.
Article in English | MEDLINE | ID: mdl-34719327

ABSTRACT

TUT4 and the closely related TUT7 are non-templated poly(U) polymerases required at different stages of development, and their mis-regulation or mutation has been linked to important cancer pathologies. While TUT4(7) interaction with its pre-miRNA targets has been characterized in detail, the molecular bases of the broader target recognition process are unclear. Here, we examine RNA binding by the ZnF domains of the protein. We show that TUT4(7) ZnF2 contains two distinct RNA binding surfaces that are used in the interaction with different RNA nucleobases in different targets, i.e., that this small domain encodes diversity in TUT4(7) selectivity and molecular function. Interestingly and unlike other well-characterized CCHC ZnFs, ZnF2 is not physically coupled to the flanking ZnF3 and acts independently in miRNA recognition, while the remaining CCHC ZnF of TUT4(7), ZnF1, has lost its intrinsic RNA binding capability. Together, our data suggest that the ZnFs of TUT4(7) are independent units for RNA and, possibly, protein-protein interactions that underlie the protein's functional flexibility and are likely to play an important role in building its interaction network.


Subjects
DNA-Binding Proteins/metabolism , Epistasis, Genetic , Gene Expression Regulation , MicroRNAs/genetics , RNA-Binding Proteins/metabolism , Zinc Fingers , Base Composition , DNA-Binding Proteins/chemistry , Humans , Magnetic Resonance Spectroscopy , MicroRNAs/chemistry , MicroRNAs/metabolism , Poly U , Protein Interaction Domains and Motifs , RNA-Binding Proteins/chemistry , Structure-Activity Relationship
6.
Nucleic Acids Res ; 45(11): 6761-6774, 2017 Jun 20.
Article in English | MEDLINE | ID: mdl-28379442

ABSTRACT

RBM10 is an RNA-binding protein that plays an essential role in development and is frequently mutated in the context of human disease. RBM10 recognizes a diverse set of RNA motifs in introns and exons and regulates alternative splicing. However, the molecular mechanisms underlying this seemingly relaxed sequence specificity are not understood and functional studies have focused on 3′ intronic sites only. Here, we dissect the RNA code recognized by RBM10 and relate it to the splicing regulatory function of this protein. We show that a two-domain RRM1-ZnF unit recognizes a GGA-centered motif enriched in RBM10 exonic sites with high affinity and specificity and test that the interaction with these exonic sequences promotes exon skipping. Importantly, a second RRM domain (RRM2) of RBM10 recognizes a C-rich sequence, which explains its known interaction with the intronic 3′ site of NUMB exon 9 contributing to regulation of the Notch pathway in cancer. Together, these findings explain RBM10's broad RNA specificity and suggest that RBM10 functions as a splicing regulator using two RNA-binding units with different specificities to promote exon skipping.


Subjects
RNA-Binding Proteins/physiology , Autoantigens , Base Sequence , Binding Sites , Exons , HEK293 Cells , Humans , Protein Binding , RNA Splicing , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA-Binding Proteins/chemistry , Zinc Fingers
7.
Nucleic Acids Res ; 43(6): e41, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25586222

ABSTRACT

Defining the RNA target selectivity of the proteins regulating mRNA metabolism is a key issue in RNA biology. Here we present a novel use of principal component analysis (PCA) to extract the RNA sequence preference of RNA binding proteins. We show that PCA can be used to compare the changes in the nuclear magnetic resonance (NMR) spectrum of a protein upon binding a set of quasi-degenerate RNAs and define the nucleobase specificity. We couple this application of PCA to an automated NMR spectra recording and processing protocol and obtain an unbiased and high-throughput NMR method for the analysis of nucleobase preference in protein-RNA interactions. We test the method on the RNA binding domains of three important regulators of RNA metabolism.


Subjects
High-Throughput Screening Assays/methods , Nuclear Magnetic Resonance, Biomolecular/methods , RNA-Binding Proteins/metabolism , RNA/genetics , RNA/metabolism , Base Sequence , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , High-Throughput Screening Assays/statistics & numerical data , Humans , Models, Molecular , Principal Component Analysis , Protein Interaction Domains and Motifs , RNA-Binding Proteins/chemistry , Recombinant Proteins/chemistry , Recombinant Proteins/metabolism , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism , mRNA Cleavage and Polyadenylation Factors/chemistry , mRNA Cleavage and Polyadenylation Factors/metabolism
8.
Cell Rep Methods ; 3(6): 100508, 2023 06 26.
Article in English | MEDLINE | ID: mdl-37426752

ABSTRACT

Understanding how the RNA-binding domains of a protein regulator are used to recognize its RNA targets is a key problem in RNA biology, but RNA-binding domains with very low affinity do not perform well in the methods currently available to characterize protein-RNA interactions. Here, we propose to use conservative mutations that enhance the affinity of RNA-binding domains to overcome this limitation. As a proof of principle, we have designed and validated an affinity-enhanced K-homology (KH) domain mutant of the fragile X syndrome protein FMRP, a key regulator of neuronal development, and used this mutant to determine the domain's sequence preference and to explain FMRP recognition of specific RNA motifs in the cell. Our results validate our concept and our nuclear magnetic resonance (NMR)-based workflow. While effective mutant design requires an understanding of the underlying principles of RNA recognition by the relevant domain type, we expect the method will be used effectively in many RNA-binding domains.


Subjects
Fragile X Mental Retardation Protein , RNA , RNA/genetics , Fragile X Mental Retardation Protein/genetics , Proteins/genetics , Mutation , RNA-Binding Motifs/genetics
9.
Microorganisms ; 11(4), 2023 Apr 20.
Article in English | MEDLINE | ID: mdl-37110501

ABSTRACT

Bacteria use an array of sigma factors to regulate gene expression during different stages of their life cycles. Full-length, atomic-level structures of sigma factors have been challenging to obtain experimentally as a result of their many regions of intrinsic disorder. AlphaFold has now supplied plausible full-length models for most sigma factors. Here we discuss the current understanding of the structures and functions of sigma factors in the model organism, Bacillus subtilis, and present an X-ray crystal structure of a region of B. subtilis SigE, a sigma factor that plays a critical role in the developmental process of spore formation.

10.
Healthcare (Basel) ; 11(14), 2023 Jul 13.
Article in English | MEDLINE | ID: mdl-37510458

ABSTRACT

BACKGROUND: The prevalence of Alzheimer's disease (AD) is projected to increase as the population ages, and current treatments are minimally effective. Transcranial photobiomodulation (t-PBM) with near-infrared (NIR) light penetrates into the cerebral cortex, stimulates the mitochondrial respiratory chain, and increases cerebral blood flow. Preliminary data suggest t-PBM may be efficacious in improving cognition in people with early AD and amnestic mild cognitive impairment (aMCI). METHODS: In this randomized, double-blind, placebo-controlled study with aMCI and early AD participants, we will test the efficacy, safety, and impact on cognition of 24 sessions of t-PBM delivered over 8 weeks. Brain mechanisms of t-PBM in this population will be explored by testing whether the baseline tau burden (measured with 18F-MK6240) or changes in mitochondrial function over 8 weeks (assessed with 31P-MRSI) moderate the changes observed in cognitive functions after t-PBM therapy. We will also use changes in the fMRI Blood-Oxygenation-Level-Dependent (BOLD) signal after a single treatment to demonstrate t-PBM-dependent increases in prefrontal cortex blood flow. CONCLUSION: This study will test whether t-PBM, a low-cost, accessible, and user-friendly intervention, has the potential to improve cognition and function in an aMCI and early AD population.

11.
Cell Syst ; 14(6): 525-542.e9, 2023 06 21.
Article in English | MEDLINE | ID: mdl-37348466

ABSTRACT

The design choices underlying machine-learning (ML) models present important barriers to entry for many biologists who aim to incorporate ML in their research. Automated machine-learning (AutoML) algorithms can address many challenges that come with applying ML to the life sciences. However, these algorithms are rarely used in systems and synthetic biology studies because they typically do not explicitly handle biological sequences (e.g., nucleotide, amino acid, or glycan sequences) and cannot be easily compared with other AutoML algorithms. Here, we present BioAutoMATED, an AutoML platform for biological sequence analysis that integrates multiple AutoML methods into a unified framework. Users are automatically provided with relevant techniques for analyzing, interpreting, and designing biological sequences. BioAutoMATED predicts gene regulation, peptide-drug interactions, and glycan annotation, and designs optimized synthetic biology components, revealing salient sequence characteristics. By automating sequence modeling, BioAutoMATED allows life scientists to incorporate ML more readily into their work.


Subjects
Algorithms , Machine Learning
12.
Nat Commun ; 11(1): 5058, 2020 10 07.
Article in English | MEDLINE | ID: mdl-33028819

ABSTRACT

While synthetic biology has revolutionized our approaches to medicine, agriculture, and energy, the design of completely novel biological circuit components beyond naturally-derived templates remains challenging due to poorly understood design rules. Toehold switches, which are programmable nucleic acid sensors, face an analogous design bottleneck; our limited understanding of how sequence impacts functionality often necessitates expensive, time-consuming screens to identify effective switches. Here, we introduce Sequence-based Toehold Optimization and Redesign Model (STORM) and Nucleic-Acid Speech (NuSpeak), two orthogonal and synergistic deep learning architectures to characterize and optimize toeholds. Applying techniques from computer vision and natural language processing, we 'un-box' our models using convolutional filters, attention maps, and in silico mutagenesis. Through transfer-learning, we redesign sub-optimal toehold sensors, even with sparse training data, experimentally validating their improved performance. This work provides sequence-to-function deep learning frameworks for toehold selection and design, augmenting our ability to construct potent biological circuit components and precision diagnostics.


Subjects
Biotechnology/methods , Deep Learning , Genetic Engineering/methods , Riboswitch/genetics , Synthetic Biology/methods , Base Sequence/genetics , Computer Simulation , Datasets as Topic , Genome, Human/genetics , Genome, Viral/genetics , Humans , Models, Genetic , Mutagenesis , Natural Language Processing , Structure-Activity Relationship
13.
Structure ; 26(4): 640-648.e5, 2018 04 03.
Article in English | MEDLINE | ID: mdl-29526435

ABSTRACT

Global changes in bacterial gene expression can be orchestrated by the coordinated activation/deactivation of alternative sigma (σ) factor subunits of RNA polymerase. Sigma factors themselves are regulated in myriad ways, including via anti-sigma factors. Here, we have determined the solution structure of anti-sigma factor CsfB, responsible for inhibition of two alternative sigma factors, σG and σE, during spore formation by Bacillus subtilis. CsfB assembles into a symmetrical homodimer, with each monomer bound to a single Zn2+ ion via a treble-clef zinc finger fold. Directed mutagenesis indicates that dimer formation is critical for CsfB-mediated inhibition of both σG and σE, and we have characterized these interactions in vitro. This work represents an advance in our understanding of how CsfB mediates inhibition of two alternative sigma factors to drive developmental gene expression in a bacterium.


Subjects
Bacillus subtilis/chemistry , Gene Expression Regulation, Bacterial , Repressor Proteins/chemistry , Sigma Factor/chemistry , Spores, Bacterial/chemistry , Zinc/chemistry , Amino Acid Sequence , Bacillus subtilis/genetics , Bacillus subtilis/metabolism , Binding Sites , Cations, Divalent , Cloning, Molecular , Crystallography, X-Ray , Escherichia coli/genetics , Escherichia coli/metabolism , Genetic Vectors/chemistry , Genetic Vectors/metabolism , Models, Molecular , Mutation , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Protein Isoforms/antagonists & inhibitors , Protein Isoforms/chemistry , Protein Isoforms/genetics , Protein Isoforms/metabolism , Protein Multimerization , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Repressor Proteins/genetics , Repressor Proteins/metabolism , Sequence Alignment , Sequence Homology, Amino Acid , Sigma Factor/antagonists & inhibitors , Sigma Factor/genetics , Sigma Factor/metabolism , Spores, Bacterial/genetics , Spores, Bacterial/metabolism , Zinc/metabolism