Exploiting regulatory heterogeneity to systematically identify enhancers with high accuracy.
Arbel, Hamutal; Basu, Sumanta; Fisher, William W; Hammonds, Ann S; Wan, Kenneth H; Park, Soo; Weiszmann, Richard; Booth, Benjamin W; Keranen, Soile V; Henriquez, Clara; Shams Solari, Omid; Bickel, Peter J; Biggin, Mark D; Celniker, Susan E; Brown, James B.
Affiliation
  • Arbel H; Molecular Ecosystems Biology Department, Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720; taly@berkeley.edu bickel@stat.berkeley.edu SECelniker@lbl.gov jbbrown@lbl.gov.
  • Basu S; Department of Statistics, University of California, Berkeley, CA 94720.
  • Fisher WW; Molecular Ecosystems Biology Department, Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Hammonds AS; Department of Statistics, University of California, Berkeley, CA 94720.
  • Wan KH; Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14850.
  • Park S; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Weiszmann R; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Booth BW; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Keranen SV; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Henriquez C; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Shams Solari O; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Bickel PJ; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Biggin MD; Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720.
  • Celniker SE; Department of Statistics, University of California, Berkeley, CA 94720.
  • Brown JB; Department of Statistics, University of California, Berkeley, CA 94720; taly@berkeley.edu bickel@stat.berkeley.edu SECelniker@lbl.gov jbbrown@lbl.gov.
Proc Natl Acad Sci U S A; 116(3): 900-908, 2019 01 15.
Article in En | MEDLINE | ID: mdl-30598455
Identifying functional enhancer elements in metazoan systems is a major challenge. Large-scale validation of enhancers predicted by ENCODE reveals false-positive rates of at least 70%. We used the pregastrula-patterning network of Drosophila melanogaster to demonstrate that loss in accuracy in held-out data results from heterogeneity of functional signatures in enhancer elements. We show that at least two classes of enhancers are active during early Drosophila embryogenesis and that by focusing on a single, relatively homogeneous class of elements, greater than 98% prediction accuracy can be achieved in a balanced, completely held-out test set. The class of well-predicted elements is composed predominantly of enhancers driving multistage segmentation patterns, which we designate segmentation-driving enhancers (SDEs). Prediction is driven by the DNA occupancy of early developmental transcription factors, with almost no additional power derived from histone modifications. We further show that improved accuracy is not a property of a particular prediction method: after conditioning on the SDE set, naïve Bayes and logistic regression perform as well as more sophisticated tools. Applying this method to a genome-wide scan, we predict 1,640 SDEs that cover 1.6% of the genome. An analysis of 32 SDEs, chosen throughout our prediction rank-list and tested by whole-mount embryonic imaging of stably integrated reporter constructs, showed that >90% drove expression patterns. We achieved 86.7% precision on a genome-wide scan, with an estimated recall of at least 98%, indicating high accuracy and completeness in annotating this class of functional elements.

Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Transcription Factors / Enhancer Elements, Genetic / Sequence Analysis, DNA / Drosophila Proteins / Embryonic Development / Embryo, Nonmammalian
Type of study: Prognostic_studies
Limits: Animals
Language: En
Journal: Proc Natl Acad Sci U S A
Year: 2019
Document type: Article