Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 3 de 3
Filtrar
Mais filtros

Base de dados
Tipo de documento
Intervalo de ano de publicação
1.
Proc Natl Acad Sci U S A ; 120(15): e2216698120, 2023 04 11.
Artigo em Inglês | MEDLINE | ID: mdl-37023129

RESUMO

Discovering DNA regulatory sequence motifs and their relative positions is vital to understanding the mechanisms of gene expression regulation. Although deep convolutional neural networks (CNNs) have achieved great success in predicting cis-regulatory elements, the discovery of motifs and their combinatorial patterns from these CNN models has remained difficult. We show that the main difficulty is due to the problem of multifaceted neurons which respond to multiple types of sequence patterns. Since existing interpretation methods were mainly designed to visualize the class of sequences that can activate the neuron, the resulting visualization will correspond to a mixture of patterns. Such a mixture is usually difficult to interpret without resolving the mixed patterns. We propose the NeuronMotif algorithm to interpret such neurons. Given any convolutional neuron (CN) in the network, NeuronMotif first generates a large sample of sequences capable of activating the CN, which typically consists of a mixture of patterns. Then, the sequences are "demixed" in a layer-wise manner by backward clustering of the feature maps of the involved convolutional layers. NeuronMotif can output the sequence motifs, and the syntax rules governing their combinations are depicted by position weight matrices organized in tree structures. Compared to existing methods, the motifs found by NeuronMotif have more matches to known motifs in the JASPAR database. The higher-order patterns uncovered for deep CNs are supported by the literature and ATAC-seq footprinting. Overall, NeuronMotif enables the deciphering of cis-regulatory codes from deep CNs and enhances the utility of CNN in genome interpretation.


Assuntos
Algoritmos , Redes Neurais de Computação , Motivos de Nucleotídeos/genética , Sequências Reguladoras de Ácido Nucleico/genética , Bases de Dados Factuais
2.
Front Mol Biosci ; 8: 673363, 2021.
Artigo em Inglês | MEDLINE | ID: mdl-34179082

RESUMO

Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states as well as mRNA and protein levels. Deep neural networks automatically learn informative sequence representations and interpreting them enables us to improve our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology.

3.
Elife ; 52016 05 06.
Artigo em Inglês | MEDLINE | ID: mdl-27152947

RESUMO

Enhancers constitute one of the major components of regulatory machinery of metazoans. Although several genome-wide studies have focused on finding and locating enhancers in the genomes, the fundamental principles governing their internal architecture and cis-regulatory grammar remain elusive. Here, we describe an extensive, quantitative perturbation analysis targeting the dorsal-ventral patterning gene regulatory network (GRN) controlled by Drosophila NF-κB homolog Dorsal. To understand transcription factor interactions on enhancers, we employed an ensemble of mathematical models, testing effects of cooperativity, repression, and factor potency. Models trained on the dataset correctly predict activity of evolutionarily divergent regulatory regions, providing insights into spatial relationships between repressor and activator binding sites. Importantly, the collective predictions of sets of models were effective at novel enhancer identification and characterization. Our study demonstrates how experimental dataset and modeling can be effectively combined to provide quantitative insights into cis-regulatory information on a genome-wide scale.


Assuntos
Padronização Corporal/genética , Proteínas de Drosophila/genética , Drosophila melanogaster/genética , Elementos Facilitadores Genéticos , Proteínas Nucleares/genética , Fosfoproteínas/genética , Fatores de Transcrição/genética , Animais , Drosophila melanogaster/crescimento & desenvolvimento , Regulação da Expressão Gênica no Desenvolvimento , Redes Reguladoras de Genes/genética , Genoma de Inseto , Modelos Teóricos
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA