1.
Genome Res; 26(10): 1430-1440, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27456004

ABSTRACT

Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence features can predict chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Synergistic Chromatin Model (SCM), which when trained with DNase-seq data for a cell type is capable of predicting expected read counts of genome-wide chromatin accessibility at every base from DNA sequence alone, with the highest accuracy at hypersensitive sites shared across cell types. We confirm that an SCM accurately predicts chromatin accessibility for thousands of synthetic DNA sequences using a novel CRISPR-based method of highly efficient site-specific DNA library integration. SCMs are directly interpretable and reveal that a logic based on local, nonspecific synergistic effects, largely among pioneer TFs, is sufficient to predict a large fraction of cellular chromatin accessibility in a wide variety of cell types.
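As a rough illustration of what "log-additive" sequence features could mean here, the sketch below treats the log of the expected read count at each base as a bias plus the sum of the weights of every k-mer covering that base. This is a toy model and not the published SCM: the function name `predict_counts`, the `kmer_weights` dictionary, the choice of `k`, and the rule that a k-mer only affects the bases it covers are all assumptions made for illustration.

```python
import math


def predict_counts(seq, kmer_weights, k=3, bias=0.0):
    """Toy log-additive model (hypothetical, not the published SCM).

    The log of the expected count at each base is `bias` plus the sum of
    the weights of all k-mers that cover that base. Unknown k-mers
    contribute a weight of 0.
    """
    log_rate = [bias] * len(seq)
    for start in range(len(seq) - k + 1):
        weight = kmer_weights.get(seq[start:start + k], 0.0)
        # Spread this k-mer's effect over the bases it covers.
        for pos in range(start, start + k):
            log_rate[pos] += weight
    # Additive effects in log space become multiplicative in count space.
    return [math.exp(value) for value in log_rate]


# Hypothetical weights: "ACG" doubles the expected count where it occurs.
toy_weights = {"ACG": math.log(2.0)}
toy_counts = predict_counts("ACGT", toy_weights, k=3)
```

Because the effects add in log space, two overlapping k-mers multiply their contributions to the expected count, which is one simple way to read the "synergistic" behavior the abstract describes.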


Subject(s)
Chromatin Assembly and Disassembly; Chromatin/genetics; Models, Genetic; Animals; Chromatin/metabolism; Genome, Human; Humans; Machine Learning
2.
Bioinformatics; 32(4): 490-496, 2016 Feb 15.
Article in English | MEDLINE | ID: mdl-26476779

ABSTRACT

MOTIVATION: The majority of disease-associated variants identified in genome-wide association studies reside in noncoding regions of the genome with regulatory roles. Thus, being able to interpret the functional consequence of a variant is essential for identifying causal variants in the analysis of genome-wide association studies.

RESULTS: We present GERV (generative evaluation of regulatory variants), a novel computational method for predicting regulatory variants that affect transcription factor binding. GERV learns a k-mer-based generative model of transcription factor binding from ChIP-seq and DNase-seq data, and scores variants by computing the change of predicted ChIP-seq reads between the reference and alternate allele. The k-mers learned by GERV capture more sequence determinants of transcription factor binding than a motif-based approach alone, including both a transcription factor's canonical motif and associated co-factor motifs. We show that GERV outperforms existing methods in predicting single-nucleotide polymorphisms associated with allele-specific binding. GERV correctly predicts a validated causal variant among linked single-nucleotide polymorphisms and prioritizes the variants previously reported to modulate the binding of FOXA1 in breast cancer cell lines. Thus, GERV provides a powerful approach for functionally annotating and prioritizing causal variants for experimental follow-up analysis.

AVAILABILITY AND IMPLEMENTATION: The implementation of GERV and related data are available at http://gerv.csail.mit.edu/.
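The scoring idea in the abstract, comparing predicted signal for the reference and alternate allele, can be sketched as follows. This is a minimal sketch under stated assumptions and not GERV's implementation: the names `predicted_signal` and `variant_score`, the scalar `kmer_weights` dictionary, and the use of a plain sum of k-mer weights as a stand-in for predicted ChIP-seq reads are all hypothetical.

```python
def predicted_signal(seq, kmer_weights, k=3):
    """Stand-in for predicted reads: sum of weights of all k-mers in `seq`.

    Hypothetical simplification; unknown k-mers contribute 0.
    """
    return sum(
        kmer_weights.get(seq[i:i + k], 0.0)
        for i in range(len(seq) - k + 1)
    )


def variant_score(ref_seq, pos, alt_base, kmer_weights, k=3):
    """Change in predicted signal when the base at `pos` becomes `alt_base`.

    Negative values mean the alternate allele lowers the predicted signal.
    """
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return (predicted_signal(alt_seq, kmer_weights, k)
            - predicted_signal(ref_seq, kmer_weights, k))


# Hypothetical weights: the variant replaces a strong k-mer with a weak one.
toy_weights = {"ACG": 1.5, "ATG": 0.5}
toy_score = variant_score("AACGTT", pos=2, alt_base="T", kmer_weights=toy_weights)
```

Only the k-mers overlapping the variant position change between the two alleles, so the score depends on a small window around the variant even though the whole sequence is passed in.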


Subject(s)
Algorithms; Computational Biology/methods; Models, Statistical; Polymorphism, Single Nucleotide/genetics; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/metabolism; Binding Sites; Chromatin Immunoprecipitation; Genome, Human; Genome-Wide Association Study; High-Throughput Nucleotide Sequencing; Humans; Molecular Sequence Annotation; Protein Binding