Results 1 - 20 of 56
1.
Res Sq ; 2023 Jul 19.
Article in English | MEDLINE | ID: mdl-37503119

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19,000 functional genomics experiments across more than 1,000 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines to promote data provenance and reproducibility and to allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Docker Hub (https://hub.docker.com), enabling access to a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections, a prerequisite for successful integrative analyses.

2.
bioRxiv ; 2023 Apr 06.
Article in English | MEDLINE | ID: mdl-37066421

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19,000 functional genomics experiments across more than 1,000 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines to promote data provenance and reproducibility and to allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Docker Hub (https://hub.docker.com), enabling access to a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections, a prerequisite for successful integrative analyses.

3.
Genome Biol ; 24(1): 79, 2023 04 18.
Article in English | MEDLINE | ID: mdl-37072822

ABSTRACT

A promising alternative to comprehensively performing genomics experiments is to, instead, perform a subset of experiments and use computational methods to impute the remainder. However, identifying the best imputation methods and what measures meaningfully evaluate performance are open questions. We address these questions by comprehensively analyzing 23 methods from the ENCODE Imputation Challenge. We find that imputation evaluations are challenging and confounded by distributional shifts from differences in data collection and processing over time, the amount of available data, and redundancy among performance measures. Our analyses suggest simple steps for overcoming these issues and promising directions for more robust research.


Subject(s)
Algorithms , Epigenomics , Genomics/methods
4.
Stem Cell Reports ; 16(4): 717-726, 2021 04 13.
Article in English | MEDLINE | ID: mdl-33770495

ABSTRACT

T cell development is restricted to the thymus and is dependent on high levels of Notch signaling induced within the thymic microenvironment. To understand Notch function in thymic restriction, we investigated the basis for target gene selectivity in response to quantitative differences in Notch signal strength, focusing on the chromatin architecture of genes essential for T cell differentiation. We find that high Notch signal strength is required to activate promoters of known targets essential for T cell commitment, including Il2ra, Cd3ε, and Rag1, which feature low CpG content (LCG) and DNA inaccessibility in hematopoietic stem progenitor cells. Our findings suggest that promoter DNA inaccessibility at LCG T lineage genes provides robust protection against stochastic activation in inappropriate Notch signaling contexts, limiting T cell development to the thymus.


Subject(s)
CpG Islands/genetics , Promoter Regions, Genetic/genetics , Receptors, Notch/metabolism , Signal Transduction , T-Lymphocytes/metabolism , Animals , DNA/metabolism , Deoxyribonuclease I/metabolism , Mice, Inbred C57BL
5.
Nature ; 584(7820): 244-251, 2020 08.
Article in English | MEDLINE | ID: mdl-32728217

ABSTRACT

DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA [1-5] and contain genetic variations associated with diseases and phenotypic traits [6-8]. We created high-resolution maps of DHSs from 733 human biosamples encompassing 438 cell and tissue types and states, and integrated these to delineate and numerically index approximately 3.6 million DHSs within the human genome sequence, providing a common coordinate system for regulatory DNA. Here we show that these maps highly resolve the cis-regulatory compartment of the human genome, which encodes unexpectedly diverse cell- and tissue-selective regulatory programs at very high density. These programs can be captured comprehensively by a simple vocabulary that enables the assignment to each DHS of a regulatory barcode that encapsulates its tissue manifestations, and global annotation of protein-coding and non-coding RNA genes in a manner orthogonal to gene expression. Finally, we show that sharply resolved DHSs markedly enhance the genetic association and heritability signals of diseases and traits. Rather than being confined to a small number of distal elements or promoters, we find that genetic signals converge on congruently regulated sets of DHSs that decorate entire gene bodies. Together, our results create a universal, extensible coordinate system and vocabulary for human regulatory DNA marked by DHSs, and provide a new global perspective on the architecture of human gene regulation.


Subject(s)
Chromatin/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Molecular Sequence Annotation , Chromatin/chemistry , Chromatin/metabolism , DNA/chemistry , DNA/genetics , Gene Expression Regulation , Genes/genetics , Genome, Human/genetics , Humans , Promoter Regions, Genetic/genetics , Regulatory Sequences, Nucleic Acid/genetics
6.
Nature ; 583(7818): 729-736, 2020 07.
Article in English | MEDLINE | ID: mdl-32728250

ABSTRACT

Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits [1], but it remains challenging to distinguish variants that affect regulatory function [2]. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin [3-6]. However, only a small fraction of such sites have been precisely resolved on the human genome sequence [6]. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor-DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions [1,7] is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.


Subject(s)
DNA Footprinting/standards , Genome, Human/genetics , Transcription Factors/metabolism , Consensus Sequence , DNA/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Genetics, Population , Genome-Wide Association Study , Humans , Models, Molecular , Polymorphism, Single Nucleotide , Regulatory Sequences, Nucleic Acid/genetics
8.
Cell Rep ; 31(8): 107676, 2020 05 26.
Article in English | MEDLINE | ID: mdl-32460018

ABSTRACT

The human genome encodes millions of regulatory elements, of which only a small fraction are active within a given cell type. Little is known about the global impact of chromatin remodelers on regulatory DNA landscapes and how this translates to gene expression. We use precision genome engineering to reawaken homozygously inactivated SMARCA4, a central ATPase of the human SWI/SNF chromatin remodeling complex, in lung adenocarcinoma cells. Here, we combine DNase I hypersensitivity, histone modification, and transcriptional profiling to show that SMARCA4 dramatically increases both the number and magnitude of accessible chromatin sites genome-wide, chiefly by unmasking sites of low regulatory factor occupancy. By contrast, transcriptional changes are concentrated within well-demarcated remodeling domains wherein expression of specific genes is gated by both distal element activation and promoter chromatin configuration. Our results provide a perspective on how global chromatin remodeling activity is translated to gene expression via regulatory DNA.


Subject(s)
Chromatin Assembly and Disassembly/genetics , DNA Helicases/metabolism , DNA/genetics , Gene Expression/genetics , Nuclear Proteins/metabolism , Transcription Factors/metabolism , Humans
9.
Front Plant Sci ; 10: 1434, 2019.
Article in English | MEDLINE | ID: mdl-31798605

ABSTRACT

The genome is reprogrammed during development to produce diverse cell types, largely through altered expression and activity of key transcription factors. The accessibility and critical functions of epidermal cells have made them a model for connecting transcriptional events to development in a range of model systems. In Arabidopsis thaliana and many other plants, fertilization triggers differentiation of specialized epidermal seed coat cells that have a unique morphology caused by large extracellular deposits of polysaccharides. Here, we used DNase I-seq to generate regulatory landscapes of A. thaliana seeds at two critical time points in seed coat maturation (4 and 7 DPA), enriching for seed coat cells with the INTACT method. We found over 3,000 developmentally dynamic regulatory DNA elements and explored their relationship with nearby gene expression. The dynamic regulatory elements were enriched for motifs for several transcription factor families, most notably the TCP family at the earlier time point and the MYB family at the later one. To assess the extent to which the observed regulatory sites in seeds added to previously known regulatory sites in A. thaliana, we compared our data to 11 other data sets generated with 7-day-old seedlings for diverse tissues and conditions. Surprisingly, over a quarter of the regulatory, i.e., accessible, bases observed in seeds were novel. Notably, plant regulatory landscapes from different tissues, cell types, or developmental stages were more dynamic than those generated from bulk tissue in response to environmental perturbations, highlighting the importance of extending studies of regulatory DNA to single tissues and cell types during development.

10.
EBioMedicine ; 41: 427-442, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30827930

ABSTRACT

BACKGROUND: Transcriptional dysregulation drives cancer formation but the underlying mechanisms are still poorly understood. Renal cell carcinoma (RCC) is the most common malignant kidney tumor, which canonically activates the hypoxia-inducible transcription factor (HIF) pathway. Despite intensive study, novel therapeutic strategies to target RCC have been difficult to develop. Since the RCC epigenome is relatively understudied, we sought to elucidate key mechanisms underpinning the tumor phenotype and its clinical behavior. METHODS: We performed genome-wide chromatin accessibility (DNase-seq) and transcriptome profiling (RNA-seq) on paired tumor/normal samples from 3 patients undergoing nephrectomy for removal of RCC. We incorporated publicly available data on HIF binding (ChIP-seq) in an RCC cell line. We performed integrated analyses of these high-resolution, genome-scale datasets together with larger transcriptomic data available through The Cancer Genome Atlas (TCGA). FINDINGS: Though HIF transcription factors play a cardinal role in RCC oncogenesis, we found that numerous transcription factors with an RCC-selective expression pattern also demonstrated evidence of HIF binding near their gene body. Examination of chromatin accessibility profiles revealed that some of these transcription factors influenced the tumor's regulatory landscape, notably the stem cell transcription factor POU5F1 (OCT4). Elevated POU5F1 transcript levels were correlated with advanced tumor stage and poorer overall survival in RCC patients. Unexpectedly, we discovered a HIF-pathway-responsive promoter embedded within an endogenous retroviral long terminal repeat (LTR) element at the transcriptional start site of the PSOR1C3 long non-coding RNA gene upstream of POU5F1. RNA transcripts are induced from this promoter and read through PSOR1C3 into POU5F1, producing a novel POU5F1 transcript isoform. Rather than being unique to the POU5F1 locus, we found that HIF binds to several other transcriptionally active LTR elements genome-wide, correlating with broad gene expression changes in RCC. INTERPRETATION: Integrated transcriptomic and epigenomic analysis of matched tumor and normal tissues from even a small number of primary patient samples revealed remarkably convergent shared regulatory landscapes. Several transcription factors appear to act downstream of HIF, including the potent stem cell transcription factor POU5F1. Dysregulated expression of POU5F1 is part of a larger pattern of gene expression changes in RCC that may be induced by HIF-dependent reactivation of dormant promoters embedded within endogenous retroviral LTRs.


Subject(s)
Endogenous Retroviruses/genetics , Epigenomics , Basic Helix-Loop-Helix Transcription Factors/genetics , Binding Sites , Carcinoma, Renal Cell/genetics , Carcinoma, Renal Cell/mortality , Carcinoma, Renal Cell/pathology , Cell Line, Tumor , Cytochrome Reductases/genetics , Endogenous Retroviruses/physiology , Gene Expression Regulation, Neoplastic , Humans , Hypoxia-Inducible Factor 1/genetics , Kidney Neoplasms/genetics , Kidney Neoplasms/mortality , Kidney Neoplasms/pathology , Octamer Transcription Factor-3/genetics , Octamer Transcription Factor-3/metabolism , Oxidoreductases Acting on Sulfur Group Donors , Phosphoric Diester Hydrolases/genetics , Promoter Regions, Genetic , Proteins/genetics , Pyrophosphatases/genetics , RNA, Long Noncoding , Survival Rate , Terminal Repeat Sequences/genetics , Ubiquitin-Conjugating Enzymes/genetics
11.
J Am Soc Nephrol ; 30(3): 421-441, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30760496

ABSTRACT

BACKGROUND: Linking genetic risk loci identified by genome-wide association studies (GWAS) to their causal genes remains a major challenge. Disease-associated genetic variants are concentrated in regions containing regulatory DNA elements, such as promoters and enhancers. Although researchers have previously published DNA maps of these regulatory regions for kidney tubule cells and glomerular endothelial cells, maps for podocytes and mesangial cells have not been available. METHODS: We generated regulatory DNA maps (DNase-seq) and paired gene expression profiles (RNA-seq) from primary outgrowth cultures of human glomeruli that were composed mainly of podocytes and mesangial cells. We generated similar datasets from renal cortex cultures, to compare with those of the glomerular cultures. Because regulatory DNA elements can act on target genes across large genomic distances, we also generated a chromatin conformation map from freshly isolated human glomeruli. RESULTS: We identified thousands of unique regulatory DNA elements, many located close to transcription factor genes, which the glomerular and cortex samples expressed at different levels. We found that genetic variants associated with kidney diseases (GWAS) and kidney expression quantitative trait loci were enriched in regulatory DNA regions. By combining GWAS, epigenomic, and chromatin conformation data, we functionally annotated 46 kidney disease genes. CONCLUSIONS: We demonstrate a powerful approach to functionally connect kidney disease-/trait-associated loci to their target genes by leveraging unique regulatory DNA maps and integrated epigenomic and genetic analysis. This process can be applied to other kidney cell types and will enhance our understanding of genome regulation and its effects on gene expression in kidney disease.

12.
Genome Biol ; 18(1): 49, 2017 03 09.
Article in English | MEDLINE | ID: mdl-28279197

ABSTRACT

BACKGROUND: Gene innovation by duplication is a fundamental evolutionary process but is difficult to study in humans due to the large size, high sequence identity, and mosaic nature of segmental duplication blocks. The human-specific gene hydrocephalus-inducing 2, HYDIN2, was generated by a 364 kbp duplication of 79 internal exons of the large ciliary gene HYDIN from chromosome 16q22.2 to chromosome 1q21.1. Because the HYDIN2 locus lacks the ancestral promoter and seven terminal exons of the progenitor gene, we sought to characterize transcription at this locus by coupling reverse transcription polymerase chain reaction and long-read sequencing. RESULTS: 5' RACE indicates a transcription start site for HYDIN2 outside of the duplication and we observe fusion transcripts spanning both the 5' and 3' breakpoints. We observe extensive splicing diversity leading to the formation of altered open reading frames (ORFs) that appear to be under relaxed selection. We show that HYDIN2 adopted a new promoter that drives an altered pattern of expression, with highest levels in neural tissues. We estimate that the HYDIN duplication occurred ~3.2 million years ago and find that it is nearly fixed (99.9%) for diploid copy number in contemporary humans. Examination of 73 chromosome 1q21 rearrangement patients reveals that HYDIN2 is deleted or duplicated in most cases. CONCLUSIONS: Together, these data support a model of rapid gene innovation by fusion of incomplete segmental duplications, altered tissue expression, and potential subfunctionalization or neofunctionalization of HYDIN2 early in the evolution of the Homo lineage.


Subject(s)
Gene Duplication , Gene Fusion , Neurons/metabolism , Chromosome Aberrations , Chromosome Breakpoints , Chromosome Disorders/genetics , Chromosomes, Human, Pair 1 , DNA Copy Number Variations , Evolution, Molecular , Gene Conversion , Gene Expression Profiling , Genetic Variation , Genetics, Population , Genomics/methods , Humans , Open Reading Frames , Organ Specificity/genetics , Phenotype , Selection, Genetic , Transcription, Genetic
13.
Neuroepigenetics ; 6: 10-25, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27429906

ABSTRACT

Neural stem progenitor cells (NSPCs) in the human subventricular zone (SVZ) potentially contribute to life-long neurogenesis, yet subtypes of glioblastoma multiforme (GBM) contain NSPC signatures that highlight the importance of cell fate regulation. Among numerous regulatory mechanisms, post-translational methylations of histone tails are crucial regulators of cell fate. The work presented here focuses on the role of two repressive chromatin marks, tri-methylation of histone H3 lysine 27 (H3K27me3) and of histone H4 lysine 20 (H4K20me3), in adult NSPCs within the SVZ. To best model healthy human NSPCs as they exist in vivo for epigenetic profiling of H3K27me3 and H4K20me3, we utilized NSPCs isolated from the adult SVZ of baboon brain (Papio anubis), whose brain structure and genome are similar to those of humans. The putative role of H3K27me3 in normal NSPCs predominantly falls into the regulation of gene expression, cell cycle, and differentiation, whereas H4K20me3 is involved in DNA replication/repair, metabolism, and cell cycle. Using conditional knock-out mouse models to diminish Ezh2 and Suv4-20h, which are responsible for H3K27me3 and H4K20me3, respectively, we found that both repressive marks have an irrefutable function in cell cycle regulation in the NSPC population. While both EZH2/H3K27me3 and Suv4-20h/H4K20me3 have been implicated in cancers, our comparative genomics approach between healthy NSPCs and human GBM specimens revealed that substantial sets of genes enriched with H3K27me3 and H4K20me3 in the NSPCs are altered in human GBM. In sum, our integrated analyses across species highlight important roles of H3K27me3 and H4K20me3 in normal and disease conditions in the context of NSPCs.

14.
Am J Hum Genet ; 98(1): 58-74, 2016 Jan 07.
Article in English | MEDLINE | ID: mdl-26749308

ABSTRACT

We performed whole-genome sequencing (WGS) of 208 genomes from 53 families affected by simplex autism. For the majority of these families, no copy-number variant (CNV) or candidate de novo gene-disruptive single-nucleotide variant (SNV) had been detected by microarray or whole-exome sequencing (WES). We integrated multiple CNV and SNV analyses and extensive experimental validation to identify additional candidate mutations in eight families. We report that compared to control individuals, probands showed a significant (p = 0.03) enrichment of de novo and private disruptive mutations within fetal CNS DNase I hypersensitive sites (i.e., putative regulatory regions). This effect was only observed within 50 kb of genes that have been previously associated with autism risk, including genes where dosage sensitivity has already been established by recurrent disruptive de novo protein-coding mutations (ARID1B, SCN2A, NR3C2, PRKCA, and DSCAM). In addition, we provide evidence of gene-disruptive CNVs (in DISC1, WNT7A, RBFOX1, and MBD5), as well as smaller de novo CNVs and exon-specific SNVs missed by exome sequencing in neurodevelopmental genes (e.g., CANX, SAE1, and PIK3CA). Our results suggest that the detection of smaller, often multiple CNVs affecting putative regulatory elements might help explain additional risk of simplex autism.


Subject(s)
Autistic Disorder/genetics , DNA/genetics , Genome, Human , Exome , Female , Humans , Male , Pedigree , Polymorphism, Single Nucleotide
17.
Nat Genet ; 47(12): 1393-401, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26502339

ABSTRACT

The function of human regulatory regions depends exquisitely on their local genomic environment and on cellular context, complicating experimental analysis of common disease- and trait-associated variants that localize within regulatory DNA. We use allelically resolved genomic DNase I footprinting data encompassing 166 individuals and 114 cell types to identify >60,000 common variants that directly influence transcription factor occupancy and regulatory DNA accessibility in vivo. The unprecedented scale of these data enables systematic analysis of the impact of sequence variation on transcription factor occupancy in vivo. We leverage this analysis to develop accurate models of variation affecting the recognition sites for diverse transcription factors and apply these models to discriminate nearly 500,000 common regulatory variants likely to affect transcription factor occupancy across the human genome. The approach and results provide a new foundation for the analysis and interpretation of noncoding variation in complete human genomes and for systems-level investigation of disease-associated variants.


Subject(s)
Chromatin/metabolism , Gene Expression Regulation , Genetic Variation/genetics , Polymorphism, Single Nucleotide/genetics , Promoter Regions, Genetic/genetics , Regulatory Elements, Transcriptional/genetics , Transcription Factors/metabolism , Genome, Human , Genomics/methods , Humans , Phenotype , Protein Binding , Transcription Factors/genetics
18.
Article in English | MEDLINE | ID: mdl-25972927

ABSTRACT

BACKGROUND: The brain, spinal cord, and neural retina comprise the central nervous system (CNS) of vertebrates. Understanding the regulatory mechanisms that underlie the enormous cell-type diversity of the CNS is a significant challenge. Whole-genome mapping of DNase I-hypersensitive sites (DHSs) has been used to identify cis-regulatory elements in many tissues. We have applied this approach to the mouse CNS, including developing and mature neural retina, whole brain, and two well-characterized brain regions, the cerebellum and the cerebral cortex. RESULTS: For the various regions and developmental stages of the CNS that we analyzed, there were approximately the same number of DHSs; however, there were many DHSs unique to each CNS region and developmental stage. Many of the DHSs are likely to mark enhancers that are specific to a particular CNS region and developmental stage. We validated the DNase I mapping approach for identification of CNS enhancers using the existing VISTA Browser database and with in vivo and in vitro electroporation of the retina. Analysis of transcription factor consensus sites within the DHSs shows distinct profiles of transcriptional regulators particular to each region. Clustering developmentally dynamic DHSs in the retina revealed enrichment of developmental stage-specific transcriptional regulators. Additionally, we found reporter gene activity in the retina driven from several previously uncharacterized regulatory elements surrounding the neurodevelopmental gene Otx2. Identification of DHSs shared between mouse and human showed region-specific differences in the evolution of cis-regulatory elements. CONCLUSIONS: Overall, our results demonstrate the potential of genome-wide DNase I mapping to address cis-regulatory questions regarding the regional diversity within the CNS. These data represent an extensive catalogue of potential cis-regulatory elements within the CNS that display region and temporal specificity, as well as a set of DHSs common to CNS tissues. Further examination of evolutionary conservation of DHSs between CNS regions and different species may reveal important cis-regulatory elements in the evolution of the mammalian CNS.

19.
Cell ; 161(3): 541-554, 2015 Apr 23.
Article in English | MEDLINE | ID: mdl-25910208

ABSTRACT

Major features of transcription by human RNA polymerase II (Pol II) remain poorly defined due to a lack of quantitative approaches for visualizing Pol II progress at nucleotide resolution. We developed a simple and powerful approach for performing native elongating transcript sequencing (NET-seq) in human cells that globally maps strand-specific Pol II density at nucleotide resolution. NET-seq exposes a mode of antisense transcription that originates downstream and converges on transcription from the canonical promoter. Convergent transcription is associated with a distinctive chromatin configuration and is characteristic of lower-expressed genes. Integration of NET-seq with genomic footprinting data reveals stereotypic Pol II pausing coincident with transcription factor occupancy. Finally, exons retained in mature transcripts display Pol II pausing signatures that differ markedly from skipped exons, indicating an intrinsic capacity for Pol II to recognize exons with different processing fates. Together, human NET-seq exposes the topography and regulatory complexity of human gene expression.


Subject(s)
RNA Polymerase II/metabolism , Transcription Elongation, Genetic , Alternative Splicing , Enhancer Elements, Genetic , Exons , HeLa Cells , Humans , Promoter Regions, Genetic , RNA, Antisense/genetics , Sequence Analysis, RNA/methods , Transcription Factors/metabolism , Transcription, Genetic
20.
BMC Genomics ; 16: 87, 2015 Feb 14.
Article in English | MEDLINE | ID: mdl-25765714

ABSTRACT

BACKGROUND: Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood. RESULTS: We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence in the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity in the other species for human and mouse, respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new function. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events, allowing little or no change in the TFos-target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. CONCLUSION: We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA. A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed in the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.


Subject(s)
Biological Evolution , Genome , Transcription Factors/genetics , Animals , Binding Sites , Comparative Genomic Hybridization , Humans , Mice , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/metabolism