Search | BVS Ecuador Search Portal

A synergistic DNA logic predicts genome-wide chromatin accessibility.

Hashimoto, Tatsunori; Sherwood, Richard I; Kang, Daniel D; Rajagopal, Nisha; Barkal, Amira A; Zeng, Haoyang; Emons, Bart J M; Srinivasan, Sharanya; Jaakkola, Tommi; Gifford, David K.

Genome Res; 26(10): 1430-1440, 2016 Oct.

Article in English | MEDLINE | ID: mdl-27456004

ABSTRACT

Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence features can predict chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Synergistic Chromatin Model (SCM), which when trained with DNase-seq data for a cell type is capable of predicting expected read counts of genome-wide chromatin accessibility at every base from DNA sequence alone, with the highest accuracy at hypersensitive sites shared across cell types. We confirm that an SCM accurately predicts chromatin accessibility for thousands of synthetic DNA sequences using a novel CRISPR-based method of highly efficient site-specific DNA library integration. SCMs are directly interpretable and reveal that a logic based on local, nonspecific synergistic effects, largely among pioneer TFs, is sufficient to predict a large fraction of cellular chromatin accessibility in a wide variety of cell types.
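As a rough illustration of what "log-additive" sequence features mean here, the sketch below sums per-k-mer effects in log space to get a log expected read count at each base. It is a toy example under stated assumptions, not the authors' SCM: the function name `predict_log_counts`, the `(k-mer, offset)` weight layout, and the example weights are all hypothetical.

```python
import math


def predict_log_counts(seq, weights, k=2, bias=0.0):
    """Toy log-additive model (hypothetical, not the published SCM).

    weights maps a k-mer to {offset: effect}, where offset is the distance
    from the k-mer's start to the base whose log count it influences.
    Returns the log expected read count at every base of seq.
    """
    n = len(seq)
    log_counts = [bias] * n
    for start in range(n - k + 1):
        kmer = seq[start:start + k]
        # Each k-mer occurrence adds its effects to nearby bases.
        for offset, effect in weights.get(kmer, {}).items():
            pos = start + offset
            if 0 <= pos < n:
                log_counts[pos] += effect
    return log_counts


# Illustrative weights only: "GA" opens chromatin nearby, "TT" closes it.
example_weights = {"GA": {0: 1.0, 1: 0.5}, "TT": {0: -0.5}}
log_counts = predict_log_counts("AGATT", example_weights, k=2)
expected_counts = [math.exp(v) for v in log_counts]
```

Because effects add in log space, they multiply in count space, which is one way overlapping sequence features can act synergistically on predicted accessibility.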

Subject(s)

Chromatin Assembly and Disassembly; Chromatin/genetics; Models, Genetic; Animals; Chromatin/metabolism; Genome, Human; Humans; Machine Learning

GERV: a statistical method for generative evaluation of regulatory variants for transcription factor binding.

Zeng, Haoyang; Hashimoto, Tatsunori; Kang, Daniel D; Gifford, David K.

Bioinformatics; 32(4): 490-6, 2016 Feb 15.

Article in English | MEDLINE | ID: mdl-26476779

ABSTRACT

MOTIVATION: The majority of disease-associated variants identified in genome-wide association studies reside in noncoding regions of the genome with regulatory roles. Thus being able to interpret the functional consequence of a variant is essential for identifying causal variants in the analysis of genome-wide association studies. RESULTS: We present GERV (generative evaluation of regulatory variants), a novel computational method for predicting regulatory variants that affect transcription factor binding. GERV learns a k-mer-based generative model of transcription factor binding from ChIP-seq and DNase-seq data, and scores variants by computing the change of predicted ChIP-seq reads between the reference and alternate allele. The k-mers learned by GERV capture more sequence determinants of transcription factor binding than a motif-based approach alone, including both a transcription factor's canonical motif and associated co-factor motifs. We show that GERV outperforms existing methods in predicting single-nucleotide polymorphisms associated with allele-specific binding. GERV correctly predicts a validated causal variant among linked single-nucleotide polymorphisms and prioritizes the variants previously reported to modulate the binding of FOXA1 in breast cancer cell lines. Thus, GERV provides a powerful approach for functionally annotating and prioritizing causal variants for experimental follow-up analysis. AVAILABILITY AND IMPLEMENTATION: The implementation of GERV and related data are available at http://gerv.csail.mit.edu/.
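The variant-scoring idea described in the abstract (compare predicted reads for the reference and alternate alleles) can be sketched as follows. This is a minimal toy under stated assumptions, not the GERV implementation: `predicted_reads`, `score_variant`, and the k-mer effect table are hypothetical names and values chosen for illustration.

```python
import math


def predicted_reads(seq, kmer_effects, k=3):
    """Toy read predictor: each k-mer contributes exp(effect) expected reads.

    k-mers missing from kmer_effects get effect 0.0 (one read each).
    """
    total = 0.0
    for start in range(len(seq) - k + 1):
        total += math.exp(kmer_effects.get(seq[start:start + k], 0.0))
    return total


def score_variant(ref_seq, pos, alt_base, kmer_effects, k=3):
    """Change in predicted reads when ref_seq[pos] is replaced by alt_base."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return (predicted_reads(alt_seq, kmer_effects, k)
            - predicted_reads(ref_seq, kmer_effects, k))


# Illustrative effect table: "TGT" stands in for a binding-associated k-mer.
toy_effects = {"TGT": 1.0}
# C>T at index 1 turns ACGTA into ATGTA, creating a "TGT" k-mer.
delta = score_variant("ACGTA", 1, "T", toy_effects)
```

A positive `delta` means the alternate allele is predicted to gain reads; variants could then be ranked by the magnitude of this change.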

Subject(s)

Algorithms; Computational Biology/methods; Models, Statistical; Polymorphism, Single Nucleotide/genetics; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/metabolism; Binding Sites; Chromatin Immunoprecipitation; Genome, Human; Genome-Wide Association Study; High-Throughput Nucleotide Sequencing; Humans; Molecular Sequence Annotation; Protein Binding
