ABSTRACT
Constitutive heterochromatin is traditionally viewed as the static form of heterochromatin that silences pericentromeric and telomeric repeats in a cell cycle- and differentiation-independent manner. Here, we show that, in the mouse olfactory epithelium, olfactory receptor (OR) genes are marked in a highly dynamic fashion with the molecular hallmarks of constitutive heterochromatin, H3K9me3 and H4K20me3. The cell type and developmentally dependent deposition of these marks along the OR clusters are, most likely, reversed during the process of OR choice to allow for monogenic and monoallelic OR expression. In contrast to the current view of OR choice, our data suggest that OR silencing takes place before OR expression, indicating that it is not the product of an OR-elicited feedback signal. Our findings suggest that chromatin-mediated silencing lays a molecular foundation upon which singular and stochastic selection for gene expression can be applied.
Subject(s)
Chromatin Assembly and Disassembly , Gene Silencing , Olfactory Mucosa/metabolism , Odorant Receptors/genetics , Animals , Chromatin Immunoprecipitation , Gene Expression , Heterochromatin , Histone Code , Mice , Inbred C57BL Mice , Transgenic Mice , Oligonucleotide Array Sequence Analysis
ABSTRACT
We sought to better understand the immune response during the immediate post-diagnosis phase of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by identifying molecular associations with longitudinal disease outcomes. Multi-omic analyses identified differences in immune cell composition, cytokine levels, and cell subset-specific transcriptomic and epigenomic signatures between individuals on a more serious disease trajectory (Progressors) as compared to those on a milder course (Non-progressors). Higher levels of multiple cytokines were observed in Progressors, with IL-6 showing the largest difference. Blood monocyte cell subsets were also skewed, showing a comparative decrease in non-classical CD14-CD16+ and intermediate CD14+CD16+ monocytes. In lymphocytes, the CD8+ T effector memory cells displayed a gene expression signature consistent with stronger T cell activation in Progressors. These early stage observations could serve as the basis for the development of prognostic biomarkers of disease risk and interventional strategies to improve the management of severe COVID-19. BACKGROUND: Much of the literature on immune response post-SARS-CoV-2 infection has been in the acute and post-acute phases of infection. TRANSLATIONAL SIGNIFICANCE: We found differences at early time points of infection in approximately 160 participants. We compared multi-omic signatures in immune cells between individuals progressing to needing more significant medical intervention and non-progressors. We observed widespread evidence of a state of increased inflammation associated with progression, supported by a range of epigenomic, transcriptomic, and proteomic signatures. The signatures we identified support other findings at later time points and serve as the basis for prognostic biomarker development or to inform interventional strategies.
Subject(s)
COVID-19 , Humans , Multiomics , Proteomics , SARS-CoV-2 , Cytokines
ABSTRACT
The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.
Subject(s)
Genetic Epigenesis/genetics , Epigenomics , Human Genome/genetics , Base Sequence , Cell Lineage/genetics , Cultured Cells , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Human Chromosomes/chemistry , Human Chromosomes/genetics , Human Chromosomes/metabolism , DNA/chemistry , DNA/genetics , DNA/metabolism , DNA Methylation , Datasets as Topic , Genetic Enhancer Elements/genetics , Genetic Variation/genetics , Genome-Wide Association Study , Histones/metabolism , Humans , Organ Specificity/genetics , RNA/genetics , Reference Values
ABSTRACT
Despite the large evolutionary distances between metazoan species, they can show remarkable commonalities in their biology, and this has helped to establish fly and worm as model organisms for human biology. Although studies of individual elements and factors have explored similarities in gene regulation, a large-scale comparative analysis of basic principles of transcriptional regulatory features is lacking. Here we map the genome-wide binding locations of 165 human, 93 worm and 52 fly transcription regulatory factors, generating a total of 1,019 data sets from diverse cell types, developmental stages, or conditions in the three species, of which 498 (48.9%) are presented here for the first time. We find that structural properties of regulatory networks are remarkably conserved and that orthologous regulatory factor families recognize similar binding motifs in vivo and show some similar co-associations. Our results suggest that gene-regulatory properties previously observed for individual factors are general principles of metazoan regulation that are remarkably well-preserved despite extensive functional divergence of individual network connections. The comparative maps of regulatory circuitry provided here will drive an improved understanding of the regulatory underpinnings of model organism biology and how these relate to human biology, development and disease.
Subject(s)
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Molecular Evolution , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Transcription Factors/metabolism , Animals , Binding Sites , Caenorhabditis elegans/growth & development , Chromatin Immunoprecipitation , Conserved Sequence/genetics , Drosophila melanogaster/growth & development , Developmental Gene Expression Regulation/genetics , Genome/genetics , Humans , Molecular Sequence Annotation , Nucleotide Motifs/genetics , Organ Specificity/genetics , Transcription Factors/genetics
ABSTRACT
Annotation of regulatory elements and identification of the transcription-related factors (TRFs) targeting these elements are key steps in understanding how cells interpret their genetic blueprint and their environment during development, and how that process goes awry in the case of disease. One goal of the modENCODE (model organism ENCyclopedia of DNA Elements) Project is to survey a diverse sampling of TRFs, both DNA-binding and non-DNA-binding factors, to provide a framework for the subsequent study of the mechanisms by which transcriptional regulators target the genome. Here we provide an updated map of the Drosophila melanogaster regulatory genome based on the location of 84 TRFs at various stages of development. This regulatory map reveals a variety of genomic targeting patterns, including factors with strong preferences toward proximal promoter binding, factors that target intergenic and intronic DNA, and factors with distinct chromatin state preferences. The data also highlight the stringency of the Polycomb regulatory network, and show association of the Trithorax-like (Trl) protein with hotspots of DNA binding throughout development. Furthermore, the data identify more than 5800 instances in which TRFs target DNA regions with demonstrated enhancer activity. Regions of high TRF co-occupancy are more likely to be associated with open enhancers used across cell types, while lower TRF occupancy regions are associated with complex enhancers that are also regulated at the epigenetic level. Together these data serve as a resource for the research community in the continued effort to dissect transcriptional regulatory mechanisms directing Drosophila development.
Subject(s)
Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Gene Expression Regulation , Insect Genome , Transcription Factors , Genetic Transcription , Animals , Base Sequence , Binding Sites , Chromatin/genetics , Chromatin/metabolism , Cluster Analysis , Computational Biology/methods , Genetic Enhancer Elements , Gene Expression Profiling , Genomics/methods , Nucleotide Motifs , Protein Binding , Nucleic Acid Regulatory Sequences , Transcription Factors/metabolism
ABSTRACT
Chromatin profiling has emerged as a powerful means of genome annotation and detection of regulatory activity. The approach is especially well suited to the characterization of non-coding portions of the genome, which critically contribute to cellular phenotypes yet remain largely uncharted. Here we map nine chromatin marks across nine cell types to systematically characterize regulatory elements, their cell-type specificities and their functional interactions. Focusing on cell-type-specific patterns of promoters and enhancers, we define multicell activity profiles for chromatin state, gene expression, regulatory motif enrichment and regulator expression. We use correlations between these profiles to link enhancers to putative target genes, and predict the cell-type-specific activators and repressors that modulate them. The resulting annotations and regulatory predictions have implications for the interpretation of genome-wide association studies. Top-scoring disease single nucleotide polymorphisms are frequently positioned within enhancer elements specifically active in relevant cell types, and in some cases affect a motif instance for a predicted regulator, thus suggesting a mechanism for the association. Our study presents a general framework for deciphering cis-regulatory connections and their roles in disease.
Subject(s)
Cell Physiological Phenomena , Chromatin/genetics , Chromatin/metabolism , Chromosome Mapping , Binding Sites , Cell Line , Tumor Cell Line , Cultured Cells , Gene Expression Regulation , Human Genome/genetics , Hep G2 Cells , Humans , Genetic Promoter Regions/genetics , Reproducibility of Results , Transcription Factors/genetics
ABSTRACT
Systematic annotation of gene regulatory elements is a major challenge in genome science. Direct mapping of chromatin modification marks and transcription factor binding sites genome-wide has successfully identified specific subtypes of regulatory elements. In Drosophila several pioneering studies have provided genome-wide identification of Polycomb response elements, chromatin states, transcription factor binding sites, RNA polymerase II regulation and insulator elements; however, comprehensive annotation of the regulatory genome remains a significant challenge. Here we describe results from the modENCODE cis-regulatory annotation project. We produced a map of the Drosophila melanogaster regulatory genome on the basis of more than 300 chromatin immunoprecipitation data sets for eight chromatin features, five histone deacetylases and thirty-eight site-specific transcription factors at different stages of development. Using these data we inferred more than 20,000 candidate regulatory elements and validated a subset of predictions for promoters, enhancers and insulators in vivo. We also identified nearly 2,000 genomic regions of dense transcription factor binding associated with chromatin activity and accessibility. We discovered hundreds of new transcription factor co-binding relationships and defined a transcription factor network with over 800 potential regulatory relationships.
Subject(s)
Drosophila melanogaster/genetics , Insect Genome/genetics , Molecular Sequence Annotation , Nucleic Acid Regulatory Sequences/genetics , Animals , Chromatin/metabolism , Chromatin Assembly and Disassembly , Chromatin Immunoprecipitation , Genetic Enhancer Elements/genetics , Histone Deacetylases/metabolism , Insulator Elements/genetics , Genetic Promoter Regions/genetics , Reproducibility of Results , Transcriptional Silencer Elements/genetics , Transcription Factors/metabolism
ABSTRACT
The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ~4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ~60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.
Subject(s)
Molecular Evolution , Human Genome/genetics , Genome/genetics , Mammals/genetics , Animals , Disease , Exons/genetics , Genomics , Health , Humans , Molecular Sequence Annotation , Phylogeny , RNA/classification , RNA/genetics , Genetic Selection/genetics , Sequence Alignment , DNA Sequence Analysis
ABSTRACT
Genome-wide chromatin annotations have permitted the mapping of putative regulatory elements across multiple human cell types. However, their experimental dissection by directed regulatory motif disruption has remained unfeasible at the genome scale. Here, we use a massively parallel reporter assay (MPRA) to measure the transcriptional levels induced by 145-bp DNA segments centered on evolutionarily conserved regulatory motif instances within enhancer chromatin states. We select five predicted activators (HNF1, HNF4, FOXA, GATA, NFE2L2) and two predicted repressors (GFI1, ZFP161) and measure reporter expression in erythroleukemia (K562) and liver carcinoma (HepG2) cell lines. We test 2104 wild-type sequences and 3314 engineered enhancer variants containing targeted motif disruptions, each using 10 barcode tags and two replicates. The resulting data strongly confirm the enhancer activity and cell-type specificity of enhancer chromatin states, the ability of 145-bp segments to recapitulate both, the necessary role of regulatory motifs in enhancer function, and the complementary roles of activator and repressor motifs. We find statistically robust evidence that (1) disrupting the predicted activator motifs abolishes enhancer function, while silent or motif-improving changes maintain enhancer activity; (2) evolutionary conservation, nucleosome exclusion, binding of other factors, and strength of the motif match are predictive of enhancer activity; (3) scrambling repressor motifs leads to aberrant reporter expression in cell lines where the enhancers are usually inactive. Our results suggest a general strategy for deciphering cis-regulatory elements by systematic large-scale manipulation and provide quantitative enhancer activity measurements across thousands of constructs that can be mined to develop predictive models of gene expression.
Subject(s)
Chromatin/genetics , Genetic Enhancer Elements , Nucleotide Motifs/genetics , Genetic Transcription , Base Sequence , Binding Sites , Cells/classification , Cells/metabolism , Chromosome Mapping , Conserved Sequence , Gene Expression Regulation , Reporter Genes , Human Genome , Hep G2 Cells , Humans , Genetic Promoter Regions
ABSTRACT
Recent advances in technology have led to a dramatic increase in the number of available transcription factor ChIP-seq and ChIP-chip data sets. Understanding the motif content of these data sets is an important step in understanding the underlying mechanisms of regulation. Here we provide a systematic motif analysis for 427 human ChIP-seq data sets using motifs curated from the literature and also discovered de novo using five established motif discovery tools. We use a systematic pipeline for calculating motif enrichment in each data set, providing a principled way for choosing between motif variants found in the literature and for flagging potentially problematic data sets. Our analysis confirms the known specificity of 41 of the 56 analyzed factor groups and reveals motifs of potential cofactors. We also use cell type-specific binding to find factors active in specific conditions. The resource we provide is accessible both for browsing a small number of factors and for performing large-scale systematic analyses. We provide motif matrices, instances and enrichments in each of the ENCODE data sets. The motifs discovered here have been used in parallel studies to validate the specificity of antibodies, understand cooperativity between data sets and measure the variation of motif binding across individuals and species.
Subject(s)
Transcriptional Regulatory Elements , Transcription Factors/metabolism , Binding Sites , Cell Line , Chromatin Immunoprecipitation , Molecular Evolution , Humans , Nucleotide Motifs , DNA Sequence Analysis
ABSTRACT
Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.
Subject(s)
Chromatin Immunoprecipitation/methods , Genetic Databases , High-Throughput Nucleotide Sequencing/methods , Animals , Genome/genetics , Genomics/methods , Guidelines as Topic , Histones/metabolism , Humans , Internet , Transcription Factors/metabolism
ABSTRACT
The human body is composed of diverse cell types with distinct functions. Although it is known that lineage specification depends on cell-specific gene expression, which in turn is driven by promoters, enhancers, insulators and other cis-regulatory DNA sequences for each gene, the relative roles of these regulatory elements in this process are not clear. We have previously developed a chromatin-immunoprecipitation-based microarray method (ChIP-chip) to locate promoters, enhancers and insulators in the human genome. Here we use the same approach to identify these elements in multiple cell types and investigate their roles in cell-type-specific gene expression. We observed that the chromatin state at promoters and CTCF-binding at insulators is largely invariant across diverse cell types. In contrast, enhancers are marked with highly cell-type-specific histone modification patterns, strongly correlate to cell-type-specific gene expression programs on a global scale, and are functionally active in a cell-type-specific manner. Our results define over 55,000 potential transcriptional enhancers in the human genome, significantly expanding the current catalogue of human enhancers and highlighting the role of these elements in cell-type-specific gene expression.
Subject(s)
Cell Physiological Phenomena , Gene Expression Regulation , Histones/metabolism , Transcription Factors/genetics , Binding Sites , Cell Line , Chromatin/genetics , Human Genome/genetics , HeLa Cells , Humans , K562 Cells , Genetic Promoter Regions/genetics , Transcription Factors/metabolism
ABSTRACT
The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes, especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain ~2% of their synonymous sites. We collect numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape.
Subject(s)
Genome , Mammals/genetics , Open Reading Frames/genetics , Genetic Selection , Animals , Base Composition , Base Sequence , Codon , Initiator Codon , Computational Biology , Conserved Sequence , Genetic Enhancer Elements , Exons , Gene Order , BRCA1 Genes , Homeodomain Proteins/genetics , Humans , MicroRNAs/metabolism , Molecular Sequence Data , Mutation Rate , Nucleic Acid Conformation , Nucleosomes/metabolism , Translational Peptide Chain Initiation , RNA Splicing , Sequence Alignment , Genetic Transcription
ABSTRACT
Background: Infection by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) can lead to post-acute sequelae of SARS-CoV-2 (PASC) that can persist for weeks to years following initial viral infection. Clinical manifestations of PASC are heterogeneous and often involve multiple organs. While many hypotheses have been made on the mechanisms of PASC and its associated symptoms, the acute biological drivers of PASC are still unknown. Methods: We enrolled 494 patients with COVID-19 at their initial presentation to a hospital or clinic and followed them longitudinally to determine their development of PASC. From 341 patients, we conducted multi-omic profiling on peripheral blood samples collected shortly after study enrollment to investigate early immune signatures associated with the development of PASC. Results: During the first week of COVID-19, we observed a large number of differences in the immune profile of individuals who were hospitalized for COVID-19 compared to those individuals with COVID-19 who were not hospitalized. Differences between individuals who did or did not later develop PASC were, in comparison, more limited, but included significant differences in autoantibodies and in epigenetic and transcriptional signatures in double-negative 1 B cells, in particular. Conclusions: We found that early immune indicators of incident PASC were nuanced, with significant molecular signals manifesting predominantly in double-negative B cells, compared with the robust differences associated with hospitalization during acute COVID-19. The emerging acute differences in B cell phenotypes, especially in double-negative 1 B cells, in PASC patients highlight a potentially important role of these cells in the development of PASC.
Subject(s)
COVID-19 , Humans , SARS-CoV-2 , Post-Acute COVID-19 Syndrome , Immunologic Factors , Autoantibodies , Disease Progression
ABSTRACT
Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or 'evolutionary signatures', dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies.
Subject(s)
Drosophila/classification , Drosophila/genetics , Molecular Evolution , Insect Genome/genetics , Genomics , Animals , Base Sequence , Binding Sites , Conserved Sequence , Drosophila Proteins/genetics , Exons/genetics , Gene Expression Regulation/genetics , Insect Genes/genetics , MicroRNAs/genetics , Molecular Sequence Data , Organ Specificity , Phylogeny , Untranslated Regions/genetics
ABSTRACT
Insulators are DNA sequences that control the interactions among genomic regulatory elements and act as chromatin boundaries. A thorough understanding of their location and function is necessary to address the complexities of metazoan gene regulation. We studied by ChIP-chip the genome-wide binding sites of 6 insulator-associated proteins (dCTCF, CP190, BEAF-32, Su(Hw), Mod(mdg4), and GAF) to obtain the first comprehensive map of insulator elements in Drosophila embryos. We identify over 14,000 putative insulators, including all classically defined insulators. We find two major classes of insulators defined by dCTCF/CP190/BEAF-32 and Su(Hw), respectively. Distributional analyses of insulators revealed that particular sub-classes of insulator elements are excluded between cis-regulatory elements and their target promoters; divide differentially expressed, alternative, and divergent promoters; act as chromatin boundaries; are associated with chromosomal breakpoints among species; and are embedded within active chromatin domains. Together, these results provide a map demarcating the boundaries of gene regulatory units and a framework for understanding insulator function during the development and evolution of Drosophila.
Subject(s)
Drosophila/genetics , Insect Genome , Insulator Elements , Animals , Chromosome Mapping , Drosophila/metabolism , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Protein Binding
ABSTRACT
The pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to a rapid response by the scientific community to further understand and combat its associated pathologic etiology. A focal point has been on the immune responses mounted during the acute and post-acute phases of infection, but the immediate post-diagnosis phase remains relatively understudied. We sought to better understand the immediate post-diagnosis phase by collecting blood from study participants soon after a positive test and identifying molecular associations with longitudinal disease outcomes. Multi-omic analyses identified differences in immune cell composition, cytokine levels, and cell subset-specific transcriptomic and epigenomic signatures between individuals on a more serious disease trajectory (Progressors) as compared to those on a milder course (Non-progressors). Higher levels of multiple cytokines were observed in Progressors, with IL-6 showing the largest difference. Blood monocyte cell subsets were also skewed, showing a comparative decrease in non-classical CD14-CD16+ and intermediate CD14+CD16+ monocytes. Additionally, in the lymphocyte compartment, CD8+ T effector memory cells displayed a gene expression signature consistent with stronger T cell activation in Progressors. Importantly, the identification of these cellular and molecular immune changes occurred at the early stages of COVID-19 disease. These observations could serve as the basis for the development of prognostic biomarkers of disease risk and interventional strategies to improve the management of severe COVID-19.
ABSTRACT
BACKGROUND: Multiple sclerosis (MS) is an autoimmune condition of the central nervous system with a well-characterized genetic background. Prior analyses of MS genetics have identified broad enrichments across peripheral immune cells, yet the driver immune subsets are unclear. RESULTS: We utilize chromatin accessibility data across hematopoietic cells to identify cell type-specific enrichments of MS genetic signals. We find that CD4 T and B cells are independently enriched for MS genetics and further refine the driver subsets to Th17 and memory B cells, respectively. We replicate our findings in data from untreated and treated MS patients and find that immunomodulatory treatments suppress chromatin accessibility at driver cell types. Integration of statistical fine-mapping and chromatin interactions nominate numerous putative causal genes, illustrating complex interplay between shared and cell-specific genes. CONCLUSIONS: Overall, our study finds that open chromatin regions in CD4 T cells and B cells independently drive MS genetic signals. Our study highlights how careful integration of genetics and epigenetics can provide fine-scale insights into causal cell types and nominate new genes and pathways for disease.
Subject(s)
Multiple Sclerosis , B-Lymphocytes/metabolism , CD4-Positive T-Lymphocytes , Chromatin , Humans , Immunity , Multiple Sclerosis/genetics , Multiple Sclerosis/metabolism
ABSTRACT
BACKGROUND: Recombination rate is non-uniformly distributed across the human genome. The variation of recombination rate at both fine and large scales cannot be fully explained by DNA sequences alone. Epigenetic factors, particularly DNA methylation, have recently been proposed to influence the variation in recombination rate. RESULTS: We study the relationship between recombination rate and gene regulatory domains, defined by a gene and its linked control elements. We define these links using expression quantitative trait loci (eQTLs), methylation quantitative trait loci (meQTLs), chromatin conformation from publicly available datasets (Hi-C and ChIA-PET), and correlated activity links that we infer across cell types. Each link type shows a "recombination rate valley" of significantly reduced recombination rate compared to matched control regions. This recombination rate valley is most pronounced for gene regulatory domains of early embryonic development genes, housekeeping genes, and constitutive regulatory elements, which are known to show increased evolutionary constraint across species. Recombination rate valleys show increased DNA methylation, reduced double-stranded break initiation, and increased repair efficiency, specifically in the lineage leading to the germ line. Moreover, by using only the overlap of functional links and DNA methylation in germ cells, we are able to predict the recombination rate with high accuracy. CONCLUSIONS: Our results suggest the existence of a recombination rate valley at regulatory domains and provide a potential molecular mechanism to interpret the interplay between genetic and epigenetic variations.
Subject(s)
DNA Methylation , Genetic Recombination , Nucleic Acid Regulatory Sequences , Animals , Human Chromosomes/chemistry , Double-Stranded DNA Breaks , Embryonic Development/genetics , Gene Expression , Essential Genes , Humans , Mice , Quantitative Trait Loci
ABSTRACT
BACKGROUND: Advances in sequencing technology have boosted population genomics and made it possible to map the positions of transcription factor binding sites (TFBSs) with high precision. Here we investigate TFBS variability by combining transcription factor binding maps generated by ENCODE, modENCODE, our previously published data and other sources with genomic variation data for human individuals and Drosophila isogenic lines. RESULTS: We introduce a metric of TFBS variability that takes into account changes in motif match associated with mutation and makes it possible to investigate TFBS functional constraints instance-by-instance as well as in sets that share common biological properties. We also take advantage of the emerging per-individual transcription factor binding data to show evidence that TFBS mutations, particularly at evolutionarily conserved sites, can be efficiently buffered to ensure coherent levels of transcription factor binding. CONCLUSIONS: Our analyses provide insights into the relationship between individual and interspecies variation and show evidence for the functional buffering of TFBS mutations in both humans and flies. In a broad perspective, these results demonstrate the potential of combining functional genomics and population genetics approaches for understanding gene regulation.