Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 2 de 2
Filtrar
Más filtros

Banco de datos
Tipo del documento
País de afiliación
Intervalo de año de publicación
1.
Genome Res ; 26(10): 1430-1440, 2016 10.
Artículo en Inglés | MEDLINE | ID: mdl-27456004

RESUMEN

Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence features can predict chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Synergistic Chromatin Model (SCM), which when trained with DNase-seq data for a cell type is capable of predicting expected read counts of genome-wide chromatin accessibility at every base from DNA sequence alone, with the highest accuracy at hypersensitive sites shared across cell types. We confirm that a SCM accurately predicts chromatin accessibility for thousands of synthetic DNA sequences using a novel CRISPR-based method of highly efficient site-specific DNA library integration. SCMs are directly interpretable and reveal that a logic based on local, nonspecific synergistic effects, largely among pioneer TFs, is sufficient to predict a large fraction of cellular chromatin accessibility in a wide variety of cell types.


Asunto(s)
Ensamble y Desensamble de Cromatina , Cromatina/genética , Modelos Genéticos , Animales , Cromatina/metabolismo , Genoma Humano , Humanos , Aprendizaje Automático
2.
Bioinformatics ; 32(4): 490-6, 2016 Feb 15.
Artículo en Inglés | MEDLINE | ID: mdl-26476779

RESUMEN

MOTIVATION: The majority of disease-associated variants identified in genome-wide association studies reside in noncoding regions of the genome with regulatory roles. Thus being able to interpret the functional consequence of a variant is essential for identifying causal variants in the analysis of genome-wide association studies. RESULTS: We present GERV (generative evaluation of regulatory variants), a novel computational method for predicting regulatory variants that affect transcription factor binding. GERV learns a k-mer-based generative model of transcription factor binding from ChIP-seq and DNase-seq data, and scores variants by computing the change of predicted ChIP-seq reads between the reference and alternate allele. The k-mers learned by GERV capture more sequence determinants of transcription factor binding than a motif-based approach alone, including both a transcription factor's canonical motif and associated co-factor motifs. We show that GERV outperforms existing methods in predicting single-nucleotide polymorphisms associated with allele-specific binding. GERV correctly predicts a validated causal variant among linked single-nucleotide polymorphisms and prioritizes the variants previously reported to modulate the binding of FOXA1 in breast cancer cell lines. Thus, GERV provides a powerful approach for functionally annotating and prioritizing causal variants for experimental follow-up analysis. AVAILABILITY AND IMPLEMENTATION: The implementation of GERV and related data are available at http://gerv.csail.mit.edu/.


Asunto(s)
Algoritmos , Biología Computacional/métodos , Modelos Estadísticos , Polimorfismo de Nucleótido Simple/genética , Secuencias Reguladoras de Ácidos Nucleicos/genética , Factores de Transcripción/metabolismo , Sitios de Unión , Inmunoprecipitación de Cromatina , Genoma Humano , Estudio de Asociación del Genoma Completo , Secuenciación de Nucleótidos de Alto Rendimiento , Humanos , Anotación de Secuencia Molecular , Unión Proteica
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA