ABSTRACT
CyVerse, the largest publicly-funded open-source research cyberinfrastructure for life sciences, has played a crucial role in advancing data-driven research since the 2010s. As the technology landscape evolved with the emergence of cloud computing platforms, machine learning and artificial intelligence (AI) applications, CyVerse has enabled access by providing interfaces, Software as a Service (SaaS), and cloud-native Infrastructure as Code (IaC) to leverage new technologies. CyVerse services enable researchers to integrate institutional and private computational resources, custom software, perform analyses, and publish data in accordance with open science principles. Over the past 13 years, CyVerse has registered more than 124,000 verified accounts from 160 countries and was used for over 1,600 peer-reviewed publications. Since 2011, 45,000 students and researchers have been trained to use CyVerse. The platform has been replicated and deployed in three countries outside the US, with additional private deployments on commercial clouds for US government agencies and multinational corporations. In this manuscript, we present a strategic blueprint for creating and managing SaaS cyberinfrastructure and IaC as free and open-source software.
Subjects
Artificial Intelligence; Software; Humans; Cloud Computing; Publishing
ABSTRACT
Recent genomic data analyses have revealed important underlying logics in eukaryotic gene regulation, such as CpG islands (CGIs)-dependent dual-mode gene regulation. In mammals, genes lacking CGIs at their promoters are generally regulated by interconversion between euchromatin and heterochromatin, while genes associated with CGIs constitutively remain as euchromatin. Whether a similar mode of gene regulation exists in non-mammalian species has been unknown. Here, through comparative epigenomic analyses, we demonstrate that the dual-mode gene regulation program is common in various eukaryotes, even in the species lacking CGIs. In cases of vertebrates or plants, we find that genes associated with high methylation level promoters are inactivated by forming heterochromatin and expressed in a context-dependent manner. In contrast, the genes with low methylation level promoters are broadly expressed and remain as euchromatin even when repressed by Polycomb proteins. Furthermore, we show that invertebrate animals lacking DNA methylation, such as fruit flies and nematodes, also have divergence in gene types: some genes are regulated by Polycomb proteins, while others are regulated by heterochromatin formation. Altogether, our study establishes gene type divergence and the resulting dual-mode gene regulation as fundamental features shared in a broad range of higher eukaryotic species.
Subjects
Gene Expression Regulation; Animals; Caenorhabditis elegans/genetics; CpG Islands; DNA Methylation; Drosophila melanogaster/genetics; Epigenesis, Genetic; Gene Expression Regulation, Plant; Promoter Regions, Genetic; Transcription, Genetic; Vertebrates/genetics
ABSTRACT
Plant cells undergo two types of cell cycles: the mitotic cycle in which DNA replication is coupled to mitosis, and the endocycle in which DNA replication occurs in the absence of cell division. To investigate DNA replication programs in these two types of cell cycles, we pulse labeled intact root tips of maize (Zea mays) with 5-ethynyl-2'-deoxyuridine (EdU) and used flow sorting of nuclei to examine DNA replication timing (RT) during the transition from a mitotic cycle to an endocycle. Comparison of the sequence-based RT profiles showed that most regions of the maize genome replicate at the same time during S phase in mitotic and endocycling cells, despite the need to replicate twice as much DNA in the endocycle and the fact that endocycling is typically associated with cell differentiation. However, regions collectively corresponding to 2% of the genome displayed significant changes in timing between the two types of cell cycles. The majority of these regions are small with a median size of 135 kb, shift to a later RT in the endocycle, and are enriched for genes expressed in the root tip. We found larger regions that shifted RT in centromeres of seven of the ten maize chromosomes. These regions covered the majority of the previously defined functional centromere, which ranged between 1 and 2 Mb in size in the reference genome. They replicate mainly during mid S phase in mitotic cells but primarily in late S phase of the endocycle. In contrast, the immediately adjacent pericentromere sequences are primarily late replicating in both cell cycles. Analysis of CENH3 enrichment levels in 8C vs 2C nuclei suggested that there is only a partial replacement of CENH3 nucleosomes after endocycle replication is complete. The shift to later replication of centromeres and possible reduction in CENH3 enrichment after endocycle replication is consistent with a hypothesis that centromeres are inactivated when their function is no longer needed.
Subjects
DNA Replication Timing/genetics; DNA Replication/drug effects; Plant Roots/genetics; Zea mays/genetics; Cell Nucleus/drug effects; Cell Nucleus/genetics; Centromere/drug effects; Centromere/genetics; DNA Replication/genetics; DNA Replication Timing/drug effects; DNA, Plant/drug effects; DNA, Plant/genetics; Deoxyuridine/analogs & derivatives; Deoxyuridine/pharmacology; Endocytosis/drug effects; Meristem/drug effects; Meristem/genetics; Mitosis/drug effects; Mitosis/genetics; Nucleosomes/drug effects; Plant Roots/drug effects; Plant Roots/growth & development; S Phase/genetics; Zea mays/growth & development
ABSTRACT
All plants and animals must replicate their DNA, using a regulated process to ensure that their genomes are completely and accurately replicated. DNA replication timing programs have been extensively studied in yeast and animal systems, but much less is known about the replication programs of plants. We report a novel adaptation of the "Repli-seq" assay for use in intact root tips of maize (Zea mays) that includes several different cell lineages and present whole-genome replication timing profiles from cells in early, mid, and late S phase of the mitotic cell cycle. Maize root tips have a complex replication timing program, including regions of distinct early, mid, and late S replication that each constitute between 20 and 24% of the genome, as well as other loci corresponding to ~32% of the genome that exhibit replication activity in two different time windows. Analyses of genomic, transcriptional, and chromatin features of the euchromatic portion of the maize genome provide evidence for a gradient of early replicating, open chromatin that transitions gradually to less open and less transcriptionally active chromatin replicating in mid S phase. Our genomic level analysis also demonstrated that the centromere core replicates in mid S, before heavily compacted classical heterochromatin, including pericentromeres and knobs, which replicate during late S phase.
Subjects
DNA Replication Timing/genetics; Genomics; Meristem/cytology; Meristem/genetics; Mitosis/genetics; S Phase/genetics; Zea mays/cytology; Zea mays/genetics; Base Sequence; Chromosomes, Plant/genetics; DNA Transposable Elements/genetics; Genes, Plant; Models, Genetic; Tandem Repeat Sequences/genetics; Time Factors; Transcription, Genetic
ABSTRACT
CpG islands (CGIs) have long been implicated in the regulation of vertebrate gene expression. However, the involvement of CGIs in chromosomal architectures and associated gene expression regulations has not yet been thoroughly explored. By combining large-scale integrative data analyses and experimental validations, we show that CGIs clearly reconcile two competing models explaining nuclear gene localizations. We first identify CGI-containing (CGI+) and CGI-less (CGI-) genes are non-randomly clustered within the genome, which reflects CGI-dependent spatial gene segregation in the nucleus and corresponding gene regulatory modes. Regardless of their transcriptional activities, CGI+ genes are mainly located at the nuclear center and encounter frequent long-range chromosomal interactions. Meanwhile, nuclear peripheral CGI- genes forming heterochromatin are activated and internalized into the nuclear center by local enhancer-promoter interactions. Our findings demonstrate the crucial implications of CGIs on chromosomal architectures and gene positioning, linking the critical importance of CGIs in determining distinct mechanisms of global gene regulation in three-dimensional space in the nucleus.
Subjects
Chromosomes, Mammalian/chemistry; CpG Islands; Gene Expression Regulation; Animals; Cell Line; Cell Nucleus/genetics; Chromatin/chemistry; Mice; NIH 3T3 Cells; Transcription, Genetic
ABSTRACT
Annexins are a multigene family of calcium-dependent membrane-binding proteins that play important roles in plant cell signaling. Annexins are multifunctional proteins, and their function in plants is not comprehensively understood. Arabidopsis (Arabidopsis thaliana) annexins ANN1 and ANN2 are 64% identical in their primary structure, and both are highly expressed in seedlings. Here, we showed that ann-mutant seedlings grown in the absence of sugar show decreased primary root growth and altered columella cells in root caps; however, these mutant defects are rescued by Suc, Glc, or Fru. In seedlings grown without sugar, significant up-regulation of photosynthetic gene expression and chlorophyll accumulation was found in ann-mutant cotyledons compared to that in wild type, which indicates potential sugar starvation in the roots of ann-mutant seedlings. Unexpectedly, the overall sugar content of ann-mutant primary roots was significantly higher than that of wild-type roots when grown without sugar. To examine the diffusion of sugar along the entire root to the root tip, we examined the unloading pattern of carboxyfluorescein dye and found that post-phloem sugar transport was impaired in ann-mutant root tips compared to that in wild type. Increased levels of ROS and callose were detected in the root tips of ann-mutant seedlings grown without Suc, the latter of which would restrict plasmodesmal sugar transport to root tips. Our results indicate that ANN1 and ANN2 play an important role in post-phloem sugar transport to the root tip, which in turn indirectly influences photosynthetic rates in cotyledons. This study expands our understanding of the function of annexins in plants.
Subjects
Annexin A1/metabolism; Annexin A2/metabolism; Arabidopsis Proteins/metabolism; Phloem/metabolism; Plant Roots/metabolism; Sugars/metabolism; Annexin A1/genetics; Annexin A2/genetics; Arabidopsis/genetics; Arabidopsis/growth & development; Arabidopsis/metabolism; Arabidopsis Proteins/genetics; Biological Transport/genetics; Cotyledon/genetics; Cotyledon/growth & development; Cotyledon/metabolism; Gene Expression Regulation, Developmental; Gene Expression Regulation, Plant; Meristem/genetics; Meristem/growth & development; Meristem/metabolism; Mutation; Phloem/genetics; Photosynthesis/genetics; Plant Roots/genetics; Plant Roots/growth & development; Seedlings/genetics; Seedlings/growth & development; Seedlings/metabolism
ABSTRACT
Eukaryotes use a temporally regulated process, known as the replication timing program, to ensure that their genomes are fully and accurately duplicated during S phase. Replication timing programs are predictive of genomic features and activity and are considered to be functional readouts of chromatin organization. Although replication timing programs have been described for yeast and animal systems, much less is known about the temporal regulation of plant DNA replication or its relationship to genome sequence and chromatin structure. We used the thymidine analog, 5-ethynyl-2'-deoxyuridine, in combination with flow sorting and Repli-Seq to describe, at high-resolution, the genome-wide replication timing program for Arabidopsis (Arabidopsis thaliana) Col-0 suspension cells. We identified genomic regions that replicate predominantly during early, mid, and late S phase, and correlated these regions with genomic features and with data for chromatin state, accessibility, and long-distance interaction. Arabidopsis chromosome arms tend to replicate early while pericentromeric regions replicate late. Early and mid-replicating regions are gene-rich and predominantly euchromatic, while late regions are rich in transposable elements and primarily heterochromatic. However, the distribution of chromatin states across the different times is complex, with each replication time corresponding to a mixture of states. Early and mid-replicating sequences interact with each other and not with late sequences, but early regions are more accessible than mid regions. The replication timing program in Arabidopsis reflects a bipartite genomic organization with early/mid-replicating regions and late regions forming separate, noninteracting compartments. The temporal order of DNA replication within the early/mid compartment may be modulated largely by chromatin accessibility.
Subjects
Arabidopsis/genetics; Chromatin/genetics; Chromosomes, Plant; DNA Replication Timing; Chromatin/metabolism; DNA Transposable Elements; Flow Cytometry; Genome, Plant; Genome-Wide Association Study; S Phase/genetics; Sequence Analysis, DNA/methods
ABSTRACT
The maize genome is relatively large (~2.3 Gb) and has a complex organization of interspersed genes and transposable elements, which necessitates frequent boundaries between different types of chromatin. The examination of maize genes and conserved noncoding sequences revealed that many of these are flanked by regions of elevated asymmetric CHH (where H is A, C, or T) methylation (termed mCHH islands). These mCHH islands are quite short (~100 bp), are enriched near active genes, and often occur at the edge of the transposon that is located nearest to genes. The analysis of DNA methylation in other sequence contexts and several chromatin modifications revealed that mCHH islands mark the transition from heterochromatin-associated modifications to euchromatin-associated modifications. The presence of an mCHH island is fairly consistent in several distinct tissues that were surveyed but shows some variation among different haplotypes. The presence of insertion/deletions in promoters often influences the presence and position of an mCHH island. The mCHH islands are dependent upon RNA-directed DNA methylation activities and are lost in mop1 and mop3 mutants, but the nearby genes rarely exhibit altered expression levels. Instead, loss of an mCHH island is often accompanied by additional loss of DNA methylation in CG and CHG contexts associated with heterochromatin in nearby transposons. This suggests that mCHH islands and RNA-directed DNA methylation near maize genes may act to preserve the silencing of transposons from activity of nearby genes.
Subjects
DNA Methylation/genetics; Euchromatin/genetics; Genome, Plant; Heterochromatin/genetics; RNA, Plant/metabolism; Zea mays/genetics; Conserved Sequence/genetics; CpG Islands/genetics; DNA, Intergenic/genetics; Gene Expression Profiling; Gene Expression Regulation, Plant; Genes, Plant; Genotype; INDEL Mutation/genetics; Inverted Repeat Sequences/genetics; Transcription Initiation Site
ABSTRACT
BACKGROUND: Replication timing experiments that use label incorporation and high throughput sequencing produce peaked data similar to ChIP-Seq experiments. However, the differences in experimental design, coverage density, and possible results make traditional ChIP-Seq analysis methods inappropriate for use with replication timing. RESULTS: To accurately detect and classify regions of replication across the genome, we present Repliscan. Repliscan robustly normalizes, automatically removes outlying and uninformative data points, and classifies Repli-seq signals into discrete combinations of replication signatures. The quality control steps and self-fitting methods make Repliscan generally applicable and more robust than previous methods that classify regions based on thresholds. CONCLUSIONS: Repliscan is simple and effective to use on organisms with different genome sizes. Even with analysis window sizes as small as 1 kilobase, reliable profiles can be generated with as little as 2.4x coverage.
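As a rough illustration of what classifying Repli-seq signals "into discrete combinations of replication signatures" could look like, the sketch below labels each genomic window by the S-phase fractions whose normalized signal is close to the strongest one in that window. This is not Repliscan's actual algorithm: the function name `classify_window`, the fixed `rel_threshold` of 0.5, and the example signal values are all assumptions made for illustration (the abstract states that Repliscan uses self-fitting methods rather than fixed thresholds).

```python
from typing import Dict

# Early, mid, and late S phase, in temporal order.
S_PHASE_ORDER = ("E", "M", "L")


def classify_window(signals: Dict[str, float], rel_threshold: float = 0.5) -> str:
    """Label one window by the S-phase fractions that dominate its signal.

    `signals` maps "E", "M", "L" to a normalized replication signal.
    A fraction is kept if its signal is at least `rel_threshold` times the
    strongest signal in the window (hypothetical rule, for illustration only).
    """
    peak = max(signals.get(t, 0.0) for t in S_PHASE_ORDER)
    if peak <= 0:
        return "none"  # no replication signal detected in this window
    return "".join(
        t for t in S_PHASE_ORDER if signals.get(t, 0.0) >= rel_threshold * peak
    )


# Made-up example windows: early only, early plus mid, and no signal.
windows = [
    {"E": 3.0, "M": 0.5, "L": 0.1},
    {"E": 1.0, "M": 1.2, "L": 0.2},
    {"E": 0.0, "M": 0.0, "L": 0.0},
]
labels = [classify_window(w) for w in windows]
print(labels)
```

A combined label such as "EM" corresponds to a window that shows replication activity in two time windows, the kind of segment the maize root-tip abstracts above describe.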
Subjects
DNA Replication Timing; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Software; Genome; Genome Size
ABSTRACT
DNA methylation can play important roles in the regulation of transposable elements and genes. A collection of mutant alleles for 11 maize (Zea mays) genes predicted to play roles in controlling DNA methylation were isolated through forward- or reverse-genetic approaches. Low-coverage whole-genome bisulfite sequencing and high-coverage sequence-capture bisulfite sequencing were applied to mutant lines to determine context- and locus-specific effects of these mutations on DNA methylation profiles. Plants containing mutant alleles for components of the RNA-directed DNA methylation pathway exhibit loss of CHH methylation at many loci as well as CG and CHG methylation at a small number of loci. Plants containing loss-of-function alleles for chromomethylase (CMT) genes exhibit strong genome-wide reductions in CHG methylation and some locus-specific loss of CHH methylation. In an attempt to identify stocks with stronger reductions in DNA methylation levels than provided by single gene mutations, we performed crosses to create double mutants for the maize CMT3 orthologs, Zmet2 and Zmet5, and for the maize DDM1 orthologs, Chr101 and Chr106. While loss-of-function alleles are viable as single gene mutants, the double mutants were not recovered, suggesting that severe perturbations of the maize methylome may have stronger deleterious phenotypic effects than in Arabidopsis thaliana.
Subjects
DNA Methylation; Gene Expression Regulation, Plant; Zea mays/genetics; Alleles; Crosses, Genetic; DNA (Cytosine-5-)-Methyltransferases/genetics; Epigenomics; Genes, Plant; Mutation
ABSTRACT
DNA methylation is a stable modification of chromatin that can contribute to epigenetic variation through the regulation of genes or transposons. Profiling of DNA methylation in five maize (Zea mays) inbred lines found that while DNA methylation levels for more than 99% of the analyzed genomic regions are similar, there are still 5,000 to 20,000 context-specific differentially methylated regions (DMRs) between any two genotypes. The analysis of identical-by-state genomic regions that have limited genetic variation provided evidence that DMRs can occur without local sequence variation, but they are less common than in regions with genetic variation. Characterization of the sequence specificity of DMRs, location of DMRs relative to genes and transposons, and patterns of DNA methylation in regions flanking DMRs reveals a distinct subset of DMRs. Transcriptome profiling of the same tissue revealed that only approximately 20% of genes with qualitative (on-off) differences in gene expression are associated with DMRs, and there is little evidence for association of DMRs with genes that show quantitative differences in gene expression. We also identify a set of genes that may represent cryptic information that is silenced by DNA methylation in the reference B73 genome. Many of these genes exhibit natural variation in other genotypes, suggesting the potential for selection to act upon existing epigenetic natural variation. This study provides insights into the origin and influences of DMRs in a crop species with a complex genome organization.
Subjects
DNA Methylation; Epigenesis, Genetic; Genetic Variation; Genome, Plant/genetics; Zea mays/genetics; Breeding; Gene Expression Profiling; Gene Expression Regulation, Plant; Genotype
ABSTRACT
DNA methylation is a chromatin modification that is frequently associated with epigenetic regulation in plants and mammals. However, genetic changes such as transposon insertions can also lead to changes in DNA methylation. Genome-wide profiles of DNA methylation for 20 maize (Zea mays) inbred lines were used to discover differentially methylated regions (DMRs). The methylation level for each of these DMRs was also assayed in 31 additional maize or teosinte genotypes, resulting in the discovery of 1966 common DMRs and 1754 rare DMRs. Analysis of recombinant inbred lines provides evidence that the majority of DMRs are heritable. A local association scan found that nearly half of the DMRs with common variation are significantly associated with single nucleotide polymorphisms found within or near the DMR. Many of the DMRs that are significantly associated with local genetic variation are found near transposable elements that may contribute to the variation in DNA methylation. Analysis of gene expression in the same samples used for DNA methylation profiling identified over 300 genes with expression patterns that are significantly associated with DNA methylation variation. Collectively, our results suggest that DNA methylation variation is influenced by genetic and epigenetic changes that are often stably inherited and can influence the expression of nearby genes.
Subjects
DNA Methylation/genetics; Epigenesis, Genetic; Genetic Variation; Zea mays/genetics; Cluster Analysis; Genotype; Inbreeding; Inheritance Patterns/genetics; Models, Genetic; Recombination, Genetic/genetics; Reproducibility of Results
ABSTRACT
Plant cells release ATP into their extracellular matrix as they grow, and extracellular ATP (eATP) can modulate the rate of cell growth in diverse tissues. Two closely related apyrases (APYs) in Arabidopsis (Arabidopsis thaliana), APY1 and APY2, function, in part, to control the concentration of eATP. The expression of APY1/APY2 can be inhibited by RNA interference, and this suppression leads to an increase in the concentration of eATP in the extracellular medium and severely reduces growth. To clarify how the suppression of APY1 and APY2 is linked to growth inhibition, the gene expression changes that occur in seedlings when apyrase expression is suppressed were assayed by microarray and quantitative real-time-PCR analyses. The most significant gene expression changes induced by APY suppression were in genes involved in biotic stress responses, which include those genes regulating wall composition and extensibility. These expression changes predicted specific chemical changes in the walls of mutant seedlings, and two of these changes, wall lignification and decreased methyl ester bonds, were verified by direct analyses. Taken together, the results are consistent with the hypothesis that APY1, APY2, and eATP play important roles in the signaling steps that link biotic stresses to plant defense responses and growth changes.
Subjects
Adenosine Triphosphate/metabolism; Apyrase/metabolism; Arabidopsis Proteins/metabolism; Arabidopsis/enzymology; Cell Wall/metabolism; Gene Expression Regulation, Plant; Stress, Physiological; Apyrase/genetics; Arabidopsis/cytology; Arabidopsis/physiology; Arabidopsis Proteins/genetics; Cell Wall/enzymology; Down-Regulation/genetics; Extracellular Matrix/genetics; Extracellular Matrix/metabolism; Extracellular Space/metabolism; Gene Ontology; Genes, Plant; Hydrogen Peroxide/metabolism; Lignin/metabolism; Mutation/genetics; Oligonucleotide Array Sequence Analysis; Peroxidase/metabolism; Plant Roots/metabolism; RNA Interference; RNA, Messenger/genetics; RNA, Messenger/metabolism; Real-Time Polymerase Chain Reaction; Reproducibility of Results; Stress, Physiological/genetics; Up-Regulation/genetics
ABSTRACT
DNA replication during S phase in eukaryotes is a highly regulated process that ensures the accurate transmission of genetic material to daughter cells during cell division. Replication follows a well-defined temporal program, which has been studied extensively in humans, Drosophila, and yeast, where it is clear that the replication process is both temporally and spatially ordered. The replication timing (RT) program is increasingly considered to be a functional readout of genomic features and chromatin organization. Although there is increasing evidence that plants display important differences in their DNA replication process compared to animals, RT programs in plants have not been extensively studied. To address this deficiency, we developed an improved protocol for the genome-wide RT analysis by sequencing newly replicated DNA ("Repli-seq") and applied it to the characterization of RT in maize root tips. Our protocol uses 5-ethynyl-2'-deoxyuridine (EdU) to label replicating DNA in vivo in intact roots. Our protocol also eliminates the need for synchronization and frequently associated chemical perturbations as well as the need for cell cultures, which can accumulate genetic and epigenetic differences over time. EdU can be fluorescently labeled under mild conditions and does not degrade subnuclear structure, allowing for the differentiation of labeled and unlabeled nuclei by flow sorting, effectively eliminating contamination issues that can result from sorting on DNA content alone. We also developed an analysis pipeline for analyzing and classifying regions of replication and present it in a point-and-click application called Repliscan that eliminates the need for command line programming.
Subjects
DNA Replication Timing; Meristem; Animals; DNA; DNA Replication; Humans; S Phase
ABSTRACT
DNA methylation is a chromatin modification that can provide epigenetic regulation of gene and transposon expression. Plants utilize several pathways to establish and maintain DNA methylation in specific sequence contexts. The chromomethylase (CMT) genes maintain CHG (where H = A, C or T) methylation. The RNA-directed DNA methylation (RdDM) pathway is important for CHH methylation. Transcriptome analysis was performed in a collection of Zea mays lines carrying mutant alleles for CMT or RdDM-associated genes. While the majority of the transcriptome was not affected, we identified sets of genes and transposon families sensitive to context-specific decreases in DNA methylation in mutant lines. Many of the genes that are up-regulated in CMT mutant lines have high levels of CHG methylation, while genes that are differentially expressed in RdDM mutants are enriched for having nearby mCHH islands, implicating context-specific DNA methylation in the regulation of expression for a small number of genes. Many genes regulated by CMTs exhibit natural variation for DNA methylation and transcript abundance in a panel of diverse inbred lines. Transposon families with differential expression in the mutant genotypes show few defining features, though several families up-regulated in RdDM mutants show enriched expression in endosperm tissue, highlighting the potential importance for this pathway during reproduction. Taken together, our findings suggest that while the number of genes and transposon families whose expression is reproducibly affected by mild perturbations in context-specific methylation is small, there are distinct patterns for loci impacted by RdDM and CMT mutants.
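The three sequence contexts named above (CG, CHG, and CHH, where H = A, C or T) are determined by the two bases that follow a cytosine. The minimal sketch below shows that rule for a cytosine on the forward strand only. The function name `cytosine_context` and the "unknown" label for positions too close to the sequence end or next to an N are assumptions for illustration, not part of any tool mentioned here. A cytosine on the reverse strand would need the reverse complement of the sequence first.

```python
def cytosine_context(seq: str, pos: int) -> str:
    """Return "CG", "CHG", "CHH", or "unknown" for the cytosine at seq[pos].

    Only the forward strand is considered. H stands for A, C, or T.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position is not a cytosine")

    following = seq[pos + 1 : pos + 3]  # up to two bases after the C

    # C followed directly by G is the CG context.
    if len(following) >= 1 and following[0] == "G":
        return "CG"
    # Without two unambiguous following bases, CHG and CHH cannot be told apart.
    if len(following) < 2 or "N" in following:
        return "unknown"
    # C, then a non-G base, then G is CHG; anything else is CHH.
    if following[1] == "G":
        return "CHG"
    return "CHH"


example = "CGACAGCTAC"
contexts = {i: cytosine_context(example, i) for i, b in enumerate(example) if b == "C"}
print(contexts)
```

The final cytosine in the example has no following bases, so its context is reported as "unknown" rather than guessed.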
Subjects
DNA (Cytosine-5-)-Methyltransferases/metabolism; DNA Methylation/genetics; DNA Transposable Elements/genetics; Gene Silencing; Genes, Plant; RNA, Plant/genetics; Zea mays/genetics; Gene Expression Profiling; Gene Expression Regulation, Plant; Genetic Loci; Mutation/genetics; RNA, Plant/metabolism; Up-Regulation/genetics
ABSTRACT
Epigenetic modification of DNA through methylation is known to be involved in multiple biological processes such as gene suppression. However, the exact mechanism of how DNA methylations play their part is yet unclear. In mammals, CpG islands (CGI) have been studied extensively for their involvement in cancer. Whereas in plants, despite the fact that there are not only CpG but also CHG and CHH contexts of methylation, an efficient and easy-to-use pipeline to decipher these phenomena is still to be developed. Both ZED-align and BisuKit are user-friendly apps deployed on CyVerse infrastructure where users can use their bisulfite sequence files to run multiple command line-based packages with minimal intervention. © 2016 by John Wiley & Sons, Inc.
ABSTRACT
Both transcriptional and epigenetic regulations are fundamental for the control of eukaryotic gene expression. Here we perform a compendium analysis of >200 large sequencing data sets to elucidate the regulatory logic of global gene expression programs in mouse embryonic stem (ES) cells. We define four major classes of DNA-binding proteins (Core, PRC, MYC and CTCF) based on their target co-occupancy, and discover reciprocal regulation between the MYC and PRC classes for the activity of nearly all genes under the control of the CpG island (CGI)-containing promoters. This CGI-dependent regulatory mode explains the functional segregation between CGI-containing and CGI-less genes during early development. By defining active enhancers based on the co-occupancy of the Core class, we further demonstrate their additive roles in CGI-containing gene expression and cell type-specific roles in CGI-less gene expression. Altogether, our analyses provide novel insights into previously unknown CGI-dependent global gene regulatory modes.
Subjects
CpG Islands/genetics; DNA Methylation/genetics; DNA-Binding Proteins/genetics; Embryonic Stem Cells/cytology; Gene Expression Regulation/genetics; Animals; Base Sequence; Cell Line; DNA-Binding Proteins/classification; Enhancer Elements, Genetic/genetics; Genes, Regulator; Mice; Polycomb-Group Proteins/genetics; Promoter Regions, Genetic; Proto-Oncogene Proteins c-myc/genetics; Sequence Analysis, DNA
ABSTRACT
DNA methylation and dimethylation of lysine 9 of histone H3 (H3K9me2) are two chromatin modifications that can be associated with gene expression or recombination rate. The maize genome provides a complex landscape of interspersed genes and transposons. The genome-wide distribution of DNA methylation and H3K9me2 were investigated in seedling tissue for the maize inbred B73 and compared to patterns of these modifications observed in Arabidopsis thaliana. Most maize transposons are highly enriched for DNA methylation in CG and CHG contexts and for H3K9me2. In contrast to findings in Arabidopsis, maize CHH levels in transposons are generally low but some sub-families of transposons are enriched for CHH methylation and these families exhibit low levels of H3K9me2. The profile of modifications over genes reveals that DNA methylation and H3K9me2 is quite low near the beginning and end of genes. Although elevated CG and CHG methylation are found within gene bodies, CHH and H3K9me2 remain low. Maize has much higher levels of CHG methylation within gene bodies than observed in Arabidopsis and this is partially attributable to the presence of transposons within introns for some maize genes. These transposons are associated with high levels of CHG methylation and H3K9me2 but do not appear to prevent transcriptional elongation. Although the general trend is for a strong depletion of H3K9me2 and CHG near the transcription start site there are some putative genes that have high levels of these chromatin modifications. This study provides a clear view of the relationship between DNA methylation and H3K9me2 in the maize genome and how the distribution of these modifications is shaped by the interplay of genes and transposons.