ABSTRACT
Syntenic long non-coding RNAs (lncRNAs) often show limited sequence conservation across species, prompting concern in the field. This study delves into functional signatures of syntenic lncRNAs between humans and zebrafish. Syntenic lncRNAs are highly expressed in zebrafish, with ~90% located near protein-coding genes, either in sense or antisense orientation. During early zebrafish development and in human embryonic stem cells (H1-hESC), syntenic lncRNA loci are enriched with cis-regulatory repressor signatures, influencing the expression of development-associated genes. In later zebrafish developmental stages and specific human cell lines, these syntenic lncRNA loci function as enhancers or transcription start sites (TSS) for protein-coding genes. Analysis of transposable elements (TEs) in syntenic lncRNA sequences revealed intriguing patterns: human lncRNAs are enriched in simple repeat elements, while their zebrafish counterparts show enrichment in LTR elements. This sequence evolution likely arises from post-rearrangement mutations that enhance DNA elements or cis-regulatory functions. It may also contribute to vertebrate innovation by creating novel transcription factor binding sites within the locus. This study highlights the conserved functionality of syntenic lncRNA loci through DNA elements, emphasizing their conserved roles across species despite sequence divergence.
Subjects
Molecular Evolution, Long Noncoding RNA, Synteny, Zebrafish, Zebrafish/genetics, Long Noncoding RNA/genetics, Animals, Humans, DNA Transposable Elements/genetics, Transcription Initiation Site, Cell Line, Conserved Sequence
ABSTRACT
Fanconi anemia (FA) is a rare genetic disease characterized by congenital abnormalities and increased risk for bone marrow failure and cancer. Central nervous system defects, including acute and irreversible loss of neurological function and white matter lesions with calcifications, have become increasingly recognized among FA patients, and are collectively referred to as Fanconi Anemia Neurological Syndrome or FANS. The molecular etiology of FANS is poorly understood. In this study, we have used a functional integrative genomics approach to further define the function of the FANCD2 protein and FA pathway. Combined analysis of new and existing FANCD2 ChIP-seq datasets demonstrates that FANCD2 binds nonrandomly throughout the genome with binding enriched at transcription start sites and in broad regions spanning protein-coding gene bodies. FANCD2 demonstrates a strong preference for large neural genes involved in neuronal differentiation, synapse function, and cell adhesion, with many of these genes implicated in neurodevelopmental and neuropsychiatric disorders. Furthermore, FANCD2 binds to regions of the genome that replicate late, undergo mitotic DNA synthesis (MiDAS) under conditions of replication stress, and are hotspots for copy number variation. Our analysis describes an important targeted role for FANCD2 and the FA pathway in the maintenance of large neural gene stability.
Subjects
DNA Copy Number Variations, Fanconi Anemia Complementation Group D2 Protein, Fanconi Anemia Complementation Group D2 Protein/genetics, Fanconi Anemia Complementation Group D2 Protein/metabolism, Humans, Fanconi Anemia/genetics, Fanconi Anemia/metabolism, Neurons/metabolism, DNA Replication, Protein Binding, Transcription Initiation Site
ABSTRACT
Mycobacterium tuberculosis (MTB) is a pathogen that is known for its ability to persist in harsh environments and cause chronic infections. Understanding the regulatory networks of MTB is crucial for developing effective treatments. Small regulatory RNAs (sRNAs) play important roles in gene expression regulation in all kingdoms of life, and their classification based solely on genomic location can be imprecise due to the computational-based prediction of protein-coding genes in bacteria, which often neglects segments of mRNA such as 5'UTRs, 3'UTRs, and intercistronic regions of operons. To address this issue, our study simultaneously discovered genomic features such as TSSs, UTRs, and operons together with sRNAs in the M. tuberculosis H37Rv strain (ATCC 27294) across multiple stress conditions. Our analysis identified 1,376 sRNA candidates and 8,173 TSSs in MTB, providing valuable insights into its complex regulatory landscape. TSS mapping enabled us to classify these sRNAs into more specific categories, including promoter-associated sRNAs, 5'UTR-derived sRNAs, 3'UTR-derived sRNAs, true intergenic sRNAs, and antisense sRNAs. Three of these sRNA candidates were experimentally validated using 3'-RACE-PCR: predictedRNA_0240, predictedRNA_0325, and predictedRNA_0578. Future characterization and validation are necessary to fully elucidate the functions and roles of these sRNAs in MTB. Our study is the first to simultaneously unravel TSSs and sRNAs in MTB and demonstrate that the identification of other genomic features, such as TSSs, UTRs, and operons, allows for more accurate and specific classification of sRNAs.
Subjects
Mycobacterium tuberculosis, Operon, Bacterial RNA, Small Untranslated RNA, Transcription Initiation Site, Mycobacterium tuberculosis/genetics, Mycobacterium tuberculosis/metabolism, Small Untranslated RNA/genetics, Bacterial RNA/genetics, 5' Untranslated Regions, Bacterial Gene Expression Regulation, Physiological Stress/genetics, Bacterial Genome, 3' Untranslated Regions, Molecular Sequence Annotation
ABSTRACT
RNA polymerase II (pol II) initiates transcription from transcription start sites (TSSs) located ~30-35 bp downstream of the TATA box in metazoans, whereas in the yeast Saccharomyces cerevisiae, pol II scans further downstream for TSSs located ~40-120 bp downstream of the TATA box. Previously, we found that removal of the kinase module TFIIK (Kin28-Ccl1-Tfb3) from TFIIH shifts the TSS in a yeast in vitro system upstream to the location observed in metazoans and that addition of recombinant Tfb3 back to TFIIH-ΔTFIIK restores the downstream TSS usage. Here, we report that this biochemical activity of yeast TFIIK in TSS scanning is attributable to the Tfb3 RING domain at the interface with pol II in the pre-initiation complex (PIC). In particular, swapping Tfb3 Pro51 (a residue conserved among all fungi) with Ala or Ser as in MAT1, the metazoan homolog of Tfb3, confers an upstream TSS shift in vitro in a similar manner to the removal of TFIIK. Yeast genetic analysis suggests that both Pro51 and Arg64 of Tfb3 are required to maintain the stability of the Tfb3-pol II interface in the PIC. Cryo-electron microscopy analysis of a yeast PIC lacking TFIIK reveals considerable variability in the orientation of TFIIH, which impairs TSS scanning after promoter opening.
Subjects
RNA Polymerase II, Saccharomyces cerevisiae Proteins, Saccharomyces cerevisiae, Transcription Initiation Site, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae Proteins/metabolism, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae Proteins/chemistry, RNA Polymerase II/metabolism, RNA Polymerase II/chemistry, RNA Polymerase II/genetics, Hydrophobic and Hydrophilic Interactions, TATA Box/genetics, Transcription Factor TFIIH/metabolism, Transcription Factor TFIIH/genetics, Transcription Factor TFIIH/chemistry, Genetic Promoter Regions
ABSTRACT
Much of what we know about eukaryotic transcription stems from animals and yeast; however, plants evolved separately for over a billion years, leaving ample time for divergence in transcriptional regulation. Here we set out to elucidate fundamental properties of cis-regulatory sequences in plants. Using massively parallel reporter assays across four plant species, we demonstrate the central role of sequences downstream of the transcription start site (TSS) in transcriptional regulation. Unlike animal enhancers that are position independent, plant regulatory elements depend on their position, as altering their location relative to the TSS significantly affects transcription. We highlight the importance of the region downstream of the TSS in regulating transcription by identifying a DNA motif that is conserved across vascular plants and is sufficient to enhance gene expression in a dose-dependent manner. The identification of a large number of position-dependent enhancers points to fundamental differences in gene regulation between plants and animals.
Subjects
Genetic Enhancer Elements, Plant Gene Expression Regulation, Transcription Initiation Site, Genetic Transcription, Nucleic Acid Regulatory Sequences/genetics, Plants/genetics, Arabidopsis/genetics, Genetic Promoter Regions
ABSTRACT
Differentiation of female germline stem cells into a mature oocyte includes the expression of RNAs and proteins that drive early embryonic development in Drosophila. We have little insight into what activates the expression of these maternal factors. One candidate is the zinc-finger protein OVO. OVO is required for female germline viability and has been shown to positively regulate its own expression, as well as a downstream target, ovarian tumor, by binding to the transcriptional start site (TSS). To find additional OVO targets in the female germline and further elucidate OVO's role in oocyte development, we performed ChIP-seq to determine genome-wide OVO occupancy, as well as RNA-seq comparing hypomorphic and wild type rescue ovo alleles. OVO preferentially binds in close proximity to target TSSs genome-wide, is associated with open chromatin, transcriptionally active histone marks, and OVO-dependent expression. Motif enrichment analysis on OVO ChIP peaks identified a 5'-TAACNGT-3' OVO DNA binding motif spatially enriched near TSSs. However, the OVO DNA binding motif does not exhibit precise motif spacing relative to the TSS characteristic of RNA polymerase II complex binding core promoter elements. Integrated genomics analysis showed that 525 genes that are bound and increase in expression downstream of OVO are known to be essential maternally expressed genes. These include genes involved in anterior/posterior/germ plasm specification (bcd, exu, swa, osk, nos, aub, pgc, gcl), egg activation (png, plu, gnu, wisp, C(3)g, mtrm), translational regulation (cup, orb, bru1, me31B), and vitelline membrane formation (fs(1)N, fs(1)M3, clos). This suggests that OVO is a master transcriptional regulator of oocyte development and is responsible for the expression of structural components of the egg as well as maternally provided RNAs that are required for early embryonic development.
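As an illustration of the motif analysis described above, the following is a minimal sketch of how occurrences of the reported 5'-TAACNGT-3' motif could be located on both strands and expressed relative to a TSS. It is not the authors' pipeline: the function names, the example sequence, and the TSS coordinate are hypothetical, and only the motif itself comes from the abstract.

```python
import re

# IUPAC codes needed for this motif; N matches any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string made of A, C, G, T, N."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile an IUPAC motif into a regex; the lookahead reports overlapping hits."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile(f"(?=({body}))")


def find_motif(seq: str, motif: str = "TAACNGT"):
    """Return sorted (start, strand, matched_sequence) tuples.

    start is 0-based on the forward strand; minus-strand hits are found by
    searching the forward strand for the reverse complement of the motif.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in motif_to_regex(pattern).finditer(seq):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)


def offsets_to_tss(hits, tss: int):
    """Express hit starts relative to a (hypothetical) forward-strand TSS coordinate."""
    return [(start - tss, strand) for start, strand, _ in hits]


if __name__ == "__main__":
    example = "GGTAACAGTCCACTGTTAGG"  # made-up sequence, not real data
    hits = find_motif(example)
    print(hits)
    print(offsets_to_tss(hits, tss=10))
```

Aggregating such offsets over many promoters would show whether hits cluster near TSSs, which is a different question from whether they sit at one fixed distance, as core promoter elements do.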
Subjects
Drosophila Proteins, Drosophila melanogaster, Transcription Initiation Site, Animals, Female, Drosophila Proteins/metabolism, Drosophila Proteins/genetics, Drosophila melanogaster/genetics, Drosophila melanogaster/metabolism, Drosophila melanogaster/embryology, Developmental Gene Expression Regulation, Oocytes/metabolism, DNA-Binding Proteins, Transcription Factors
ABSTRACT
Alternative transcription start sites (TSS) are widespread in eukaryotes and can alter the 5' UTR length and coding potential of transcripts. Here we show that inorganic phosphate (Pi) availability regulates the usage of several alternative TSS in Arabidopsis (Arabidopsis thaliana). In comparison to phytohormone treatment, Pi had a pronounced and specific effect on the usage of many alternative TSS. By combining short-read RNA sequencing with long-read sequencing of full-length mRNAs, we identified a set of 45 genes showing alternative TSS under Pi deficiency. Alternative TSS affected several processes, such as translation via the exclusion of upstream open reading frames present in the 5' UTR of RETICULAN LIKE PROTEIN B1 mRNA, and subcellular localization via removal of the plastid transit peptide coding region from the mRNAs of HEME OXYGENASE 1 and SULFOQUINOVOSYLDIACYLGLYCEROL 2. Several alternative TSS also generated shorter transcripts lacking the coding potential for important domains. For example, the EVOLUTIONARILY CONSERVED C-TERMINAL REGION 4 (ECT4) locus, which encodes an N6-methyladenosine (m6A) reader, strongly expressed under Pi deficiency a short noncoding transcript (named ALTECT4) ~550 nt long with a TSS in the penultimate intron. The specific and robust induction of ALTECT4 production by Pi deficiency led to the identification of a role for m6A readers in primary root growth in response to low phosphate that is dependent on iron and is involved in modulating cell division in the root meristem. Our results identify alternative TSS usage as an important process in the plant response to Pi deficiency.
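To make the uORF-exclusion mechanism mentioned above concrete, here is a small sketch showing how a downstream alternative TSS can yield a 5' UTR that no longer contains an upstream open reading frame. The sequence, coordinates, and function names are invented for illustration. The sketch assumes an intron-free 5' UTR on the forward strand and counts only uORFs whose stop codon lies within the UTR.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def utr_for_tss(region: str, tss: int, cds_start: int) -> str:
    """Return the 5' UTR (TSS up to, but excluding, the main start codon)."""
    return region[tss:cds_start]


def find_uorfs(utr: str):
    """Return (start, end) pairs, 0-based half-open, for ATG...stop ORFs inside the UTR."""
    utr = utr.upper()
    uorfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon, in frame, until the first stop codon.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


if __name__ == "__main__":
    # Made-up locus: a short uORF (ATGAAATAG) upstream of the main ATG at index 17.
    region = "CCATGAAATAGCCCCGGATGGCTTAA"
    cds_start = 17
    for label, tss in (("upstream TSS", 0), ("downstream TSS", 11)):
        utr = utr_for_tss(region, tss, cds_start)
        print(label, utr, find_uorfs(utr))
```

With the upstream TSS the UTR carries the uORF. With the downstream TSS the transcript starts past it, so the uORF is absent from the mRNA.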
Subjects
5' Untranslated Regions, Arabidopsis, Phosphates, Transcription Initiation Site, Arabidopsis/genetics, Arabidopsis/metabolism, Phosphates/deficiency, Phosphates/metabolism, 5' Untranslated Regions/genetics, Plant Gene Expression Regulation, Messenger RNA/genetics, Messenger RNA/metabolism, Arabidopsis Proteins/genetics, Arabidopsis Proteins/metabolism
ABSTRACT
In this study, we employed short- and long-read sequencing technologies to delineate the transcriptional architecture of the human monkeypox virus and to identify key regulatory elements that govern its gene expression. Specifically, we conducted a transcriptomic analysis to annotate the transcription start sites (TSSs) and transcription end sites (TESs) of the virus by utilizing Cap Analysis of Gene Expression (CAGE) sequencing on the Illumina platform and direct RNA sequencing on an Oxford Nanopore Technologies device. Our investigations uncovered significant complexity in the use of alternative TSSs and TESs in viral genes. We also detected the promoter elements and poly(A) signals associated with the viral genes. Additionally, we identified novel genes in both the left and right variable regions of the viral genome. IMPORTANCE: Understanding how the transcription of a virus is regulated offers insight into the key mechanisms that control its life cycle. The recent outbreak of the human monkeypox virus has underscored the necessity of understanding the basic biology of its causative agent. Our results are pivotal for constructing a comprehensive transcriptomic atlas of the human monkeypox virus, providing valuable resources for future studies.
Subjects
RNA Sequence Analysis, Transcription Initiation Site, Transcriptome, Humans, RNA Sequence Analysis/methods, Monkeypox virus/genetics, Gene Expression Profiling, Viral Genome, Genetic Promoter Regions, High-Throughput Nucleotide Sequencing, Viral RNA/genetics
ABSTRACT
Transcription regulation in cestodes has been little studied. Here, we characterize the Taenia solium TATA-binding protein (TBP) gene. We found binding sites for transcription factors such as NF1, YY1, and AP-1 in the proximal promoter. We also identified two TATA-like elements in the promoter; however, neither could bind TBP. Additionally, we mapped the transcription start site (A+1) within an initiator and identified a putative downstream promoter element (DPE) located at +27 bp relative to the transcription start site. These two elements are important and functional for gene expression. Moreover, we identified the genes encoding T. solium TBP-Associated Factor 6 (TsTAF6) and 9 (TsTAF9). A Western blot assay revealed that both factors are expressed in the parasite; electrophoretic mobility shift assays and super-shift assays revealed interactions between the DPE probe and TsTAF6-TsTAF9. Finally, we used molecular dynamics simulations to formulate an interaction model among TsTAF6, TsTAF9, and the DPE probe; we stabilized the model with interactions between the histone fold domain pair in TAFs and several pairs of nucleotides in the DPE probe. We discuss novel and interesting features of the TsTAF6-TsTAF9 complex for interaction with DPE on T. solium promoters.
Subjects
Genetic Promoter Regions, TATA-Binding Protein Associated Factors, Taenia solium, Animals, Taenia solium/genetics, Taenia solium/metabolism, TATA-Binding Protein Associated Factors/genetics, TATA-Binding Protein Associated Factors/metabolism, Protein Binding, Binding Sites, Helminth Proteins/genetics, Helminth Proteins/metabolism, TATA-Box Binding Protein/metabolism, TATA-Box Binding Protein/genetics, Transcription Initiation Site, Molecular Dynamics Simulation, Gene Expression Regulation
ABSTRACT
HIV-1, the causative agent of AIDS, is a retrovirus that packages two copies of unspliced viral RNA as a dimer into newly budding virions. The unspliced viral RNA also serves as an mRNA template for translation of two polyproteins. Recent studies suggest that the fate of the viral RNA (genome or mRNA) is determined at the level of transcription. RNA polymerase II uses heterogeneous transcription start sites to generate major transcripts that differ in only two guanosines at the 5' end. Remarkably, this two-nucleotide difference is sufficient to alter the structure of the 5'-untranslated region and generate two pools of RNA with distinct functions. The presence of both RNA species is needed for optimal viral replication and fitness.
Subjects
HIV-1, Nucleic Acid Conformation, Viral RNA, Transcription Initiation Site, HIV-1/genetics, HIV-1/physiology, Viral RNA/genetics, Viral RNA/metabolism, Viral RNA/chemistry, Humans, 5' Untranslated Regions/genetics
ABSTRACT
The growth factor Neuregulin-1 (NRG1) has pleiotropic roles in proliferation and differentiation of the stem cell niche in different tissues. It has been implicated in gut, brain and muscle development and repair. Six isoform classes of NRG1 and over 28 protein isoforms have been previously described. Here we report a new class of NRG1, designated NRG1-VII to denote that these NRG1 isoforms arise from a myeloid-specific transcriptional start site (TSS) previously uncharacterized. Long-read sequencing was used to identify eight high-confidence NRG1-VII transcripts. These transcripts presented major structural differences from one another, through the use of cassette exons and alternative stop codons. Expression of NRG1-VII was confirmed in primary human monocytes and tissue resident macrophages and induced pluripotent stem cell-derived macrophages (iPSC-derived macrophages). Isoform switching via cassette exon usage and alternate polyadenylation was apparent during monocyte maturation and macrophage differentiation. NRG1-VII is the major class expressed by the myeloid lineage, including tissue-resident macrophages. Analysis of public gene expression data indicates that monocytes and macrophages are a primary source of NRG1. The size and structure of class VII isoforms suggests that they may be more diffusible through tissues than other NRG1 classes. However, the specific roles of class VII variants in tissue homeostasis and repair have not yet been determined.
Subjects
Cell Differentiation, Macrophages, Neuregulin-1, Protein Isoforms, Humans, Neuregulin-1/metabolism, Neuregulin-1/genetics, Macrophages/metabolism, Protein Isoforms/genetics, Protein Isoforms/metabolism, Monocytes/metabolism, Monocytes/cytology, Transcription Initiation Site, Induced Pluripotent Stem Cells/metabolism, Induced Pluripotent Stem Cells/cytology, Exons/genetics, Alternative Splicing, Myeloid Cells/metabolism, Myeloid Cells/cytology
ABSTRACT
The canonical view of DNA methylation, a pivotal epigenetic regulation mechanism in eukaryotes, dictates its role as a suppressor of gene activity, particularly within promoter regions. However, this view is being challenged as it is becoming increasingly evident that the connection between DNA methylation and gene expression varies depending on the genomic location and is therefore more complex than initially thought. We examined DNA methylation levels in the gut epithelium of Atlantic salmon (Salmo salar) using whole-genome bisulfite sequencing, which we correlated with gene expression data from RNA sequencing of the same gut tissue sample (RNA-seq). Assuming epigenetic signals might be pronounced between distinctive phenotypes, we compared large and small fish, finding 22 significant associations between 22 differentially methylated regions and 21 genes. We did not detect significant methylation differences between large and small fish. However, we observed a consistent signal of methylation levels around the transcription start sites (TSS), being negatively correlated with the expression levels of those genes. We found both negative and positive associations of methylation levels with gene expression further upstream or downstream of the TSS, revealing a more unpredictable pattern. The 21 genes showing significant methylation-expression correlations were involved in biological processes related to salmon health, such as growth and immune responses. Deciphering how DNA methylation affects the expression of such genes holds great potential for future applications. For instance, our results suggest the importance of genomic context in targeting epigenetic modifications to improve the welfare of aquaculture species like Atlantic salmon.
Subjects
DNA Methylation, Genetic Epigenesis, Salmo salar, Animals, Salmo salar/genetics, Salmo salar/metabolism, Intestinal Mucosa/metabolism, Transcription Initiation Site
ABSTRACT
BACKGROUND: In vertebrates, most protein-coding genes have a peak of GC-content near their 5' transcriptional start site (TSS). This feature promotes both the efficient nuclear export and translation of mRNAs. Despite the importance of GC-content for RNA metabolism, its general features, origin, and maintenance remain mysterious. We investigate the evolutionary forces shaping GC-content at the TSS of genes through both comparative genomic analysis of nucleotide substitution rates between different species and by examining human de novo mutations. RESULTS: Our data suggest that GC-peaks at TSSs were present in the last common ancestor of amniotes, and likely that of vertebrates. We observe that in apes and rodents, where recombination is directed away from TSSs by PRDM9, GC-content at the 5' end of protein-coding genes is currently undergoing mutational decay. In canids, which lack PRDM9 and perform recombination at TSSs, GC-content at the 5' end of protein-coding genes is increasing. We show that these patterns extend into the 5' end of the open reading frame, thus impacting synonymous codon position choices. CONCLUSIONS: Our results indicate that the dynamics of this GC-peak in amniotes is largely shaped by historic patterns of recombination. Since decay of GC-content towards the mutation rate equilibrium is the default state for non-functional DNA, the observed decrease in GC-content at TSSs in apes and rodents indicates that the GC-peak is not being maintained by selection on most protein-coding genes in those species.
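The GC-peak referred to above can be visualized with a simple sliding-window profile. The sketch below is only an illustration and not the method used in the study. The window parameters and the toy sequence are assumptions, and the gene is assumed to lie on the forward strand.

```python
import math


def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous bases; NaN if there are none."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return math.nan
    return sum(base in "GC" for base in called) / len(called)


def gc_profile(seq: str, tss: int, flank: int = 500, window: int = 100, step: int = 50):
    """Return (window_center_relative_to_tss, gc_fraction) pairs around a TSS.

    tss is a 0-based index into seq; windows are clipped to the sequence ends.
    """
    lo = max(0, tss - flank)
    hi = min(len(seq), tss + flank)
    profile = []
    for start in range(lo, hi - window + 1, step):
        chunk = seq[start:start + window]
        center = start + window // 2 - tss
        profile.append((center, gc_fraction(chunk)))
    return profile


if __name__ == "__main__":
    # Toy sequence with a GC-rich block centered on the hypothetical TSS at index 30.
    toy = "AT" * 10 + "GC" * 10 + "AT" * 10
    for center, gc in gc_profile(toy, tss=30, flank=30, window=20, step=20):
        print(center, gc)
```

Averaging such profiles across many genes, oriented by strand, would give the kind of aggregate GC curve in which a peak at the TSS becomes visible.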
Subjects
Base Composition, Transcription Initiation Site, Humans, Animals, Mutation, Molecular Evolution, Open Reading Frames
ABSTRACT
Alternative transcription start sites can affect transcript isoform diversity and translation levels. In a recently described form of gene regulation, coordinated transcriptional and translational interference results in transcript isoform-dependent changes in protein expression. Specifically, a long undecoded transcript isoform (LUTI) is transcribed from a gene-distal promoter, interfering with expression of the gene-proximal promoter. Although transcriptional and chromatin features associated with LUTI expression have been described, the mechanism underlying LUTI-based transcriptional interference is not well understood. Using an unbiased genetic approach followed by functional genomics, we uncovered that the Swi/Snf chromatin remodeling complex is required for co-transcriptional nucleosome remodeling that leads to LUTI-based repression. We identified genes with tandem promoters that rely on Swi/Snf function for transcriptional interference during protein folding stress, including LUTI-regulated genes. This study provides clear evidence for Swi/Snf playing a direct role in gene repression via a cis transcriptional interference mechanism.
Subjects
Chromatin Assembly and Disassembly, Non-Histone Chromosomal Proteins, Nucleosomes, Genetic Promoter Regions, Saccharomyces cerevisiae Proteins, Saccharomyces cerevisiae, Transcription Factors, Genetic Transcription, Transcription Factors/metabolism, Transcription Factors/genetics, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae Proteins/metabolism, Non-Histone Chromosomal Proteins/genetics, Non-Histone Chromosomal Proteins/metabolism, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Nucleosomes/metabolism, Nucleosomes/genetics, Fungal Gene Expression Regulation, Transcription Initiation Site, Chromatin/metabolism, Chromatin/genetics
ABSTRACT
Alternative transcription start site (TSS) usage regulation has been identified as a major means of gene expression regulation in metazoans. However, in fungi, its impact remains elusive as its study has thus far been restricted to model yeasts. Here, we first re-analyzed TSS-seq data to define genuine TSS clusters in 2 species of pathogenic Cryptococcus. We identified 2 types of TSS clusters associated with specific DNA sequence motifs. Our analysis also revealed that alternative TSS usage regulation in response to environmental cues is widespread in Cryptococcus, altering gene expression and protein targeting. Importantly, we performed a forward genetic screen to identify a unique transcription factor (TF) named Tur1, which regulates alternative TSS (altTSS) usage genome-wide when cells switch from exponential phase to stationary phase. ChIP-Seq and DamID-Seq analyses suggest that at some loci, the role of Tur1 might be direct. Tur1 has been previously shown to be essential for virulence in C. neoformans. We demonstrated here that a tur1Δ mutant strain is more sensitive to superoxide stress and is phagocytosed more efficiently by macrophages than the wild-type (WT) strain.
Subjects
Fungal Proteins, Fungal Gene Expression Regulation, Fungal Genome, Transcription Factors, Transcription Initiation Site, Fungal Proteins/genetics, Fungal Proteins/metabolism, Transcription Factors/metabolism, Transcription Factors/genetics, Cryptococcus/genetics, Cryptococcus/pathogenicity, Cryptococcus/metabolism, Cryptococcus neoformans/genetics, Cryptococcus neoformans/pathogenicity, Cryptococcus neoformans/metabolism, Macrophages/microbiology, Macrophages/metabolism, Animals, Mice, Virulence/genetics, Phagocytosis/genetics
ABSTRACT
The H3K4 methyltransferase SETD1A plays an essential role in both development and cancer. However, essential components involved in SETD1A chromatin binding remain unclear. Here, we discovered that BOD1L exhibits the highest correlated SETD1A co-dependency in human cancer cell lines. BOD1L knockout reduces leukemia cells in vitro and in vivo, and mimics the transcriptional profiles observed in SETD1A knockout cells. The loss of BOD1L immediately reduced SETD1A distribution at transcriptional start sites (TSS), induced a transcriptional elongation defect, and increased the RNA polymerase II content at TSS; however, it did not reduce H3K4me3. The Shg1 domain of BOD1L has DNA binding ability, and a tryptophan residue (W104) in the domain recruits SETD1A to chromatin through association with the SETD1A FLOS domain. In addition, the BOD1L-SETD1A complex associates with transcriptional regulators, including E2Fs. These results reveal that BOD1L mediates the interaction between chromatin and SETD1A, and regulates the non-canonical function of SETD1A in transcription.
Subjects
Chromatin, Histone-Lysine N-Methyltransferase, Histones, Animals, Humans, Mice, Tumor Cell Line, Chromatin/metabolism, Histone-Lysine N-Methyltransferase/metabolism, Histone-Lysine N-Methyltransferase/genetics, Histones/metabolism, Leukemia/genetics, Leukemia/metabolism, Protein Binding, Protein Domains, RNA Polymerase II/metabolism, Transcription Initiation Site, Genetic Transcription
ABSTRACT
Patterns of transcriptional activity are encoded in our genome through regulatory elements such as promoters or enhancers that, paradoxically, contain similar assortments of sequence-specific transcription factor (TF) binding sites [1-3]. Knowledge of how these sequence motifs encode multiple, often overlapping, gene expression programs is central to understanding gene regulation and how mutations in non-coding DNA manifest in disease [4,5]. Here, by studying gene regulation from the perspective of individual transcription start sites (TSSs), using natural genetic variation, perturbation of endogenous TF protein levels and massively parallel analysis of natural and synthetic regulatory elements, we show that the effect of TF binding on transcription initiation is position dependent. Analysing TF-binding-site occurrences relative to the TSS, we identified several motifs with highly preferential positioning. We show that these patterns are a combination of a TF's distinct functional profiles: many TFs, including canonical activators such as NRF1, NFY and Sp1, activate or repress transcription initiation depending on their precise position relative to the TSS. As such, TFs and their spacing collectively guide the site and frequency of transcription initiation. More broadly, these findings reveal how similar assortments of TF binding sites can generate distinct gene regulatory outcomes depending on their spatial configuration and how DNA sequence polymorphisms may contribute to transcription variation and disease and underscore a critical role for TSS data in decoding the regulatory information of our genome.
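Positional analyses of this kind rest on expressing each binding-site coordinate relative to its TSS in a strand-aware way. The following is a minimal sketch of that bookkeeping, with hypothetical function names and made-up coordinates; it does not reproduce the study's analysis.

```python
from collections import Counter


def relative_position(site: int, tss: int, strand: str) -> int:
    """Signed distance from TSS to site; negative means upstream of the TSS."""
    if strand == "+":
        return site - tss
    if strand == "-":
        return tss - site
    raise ValueError(f"unknown strand: {strand!r}")


def position_histogram(sites, bin_size: int = 10) -> Counter:
    """Count sites per bin; each bin is labeled by its lower edge.

    sites is an iterable of (site_coordinate, tss_coordinate, strand) tuples.
    """
    counts = Counter()
    for site, tss, strand in sites:
        rel = relative_position(site, tss, strand)
        counts[(rel // bin_size) * bin_size] += 1
    return counts


if __name__ == "__main__":
    # Made-up coordinates: two sites 50 bp upstream (one per strand), one 5 bp downstream.
    sites = [(950, 1000, "+"), (1050, 1000, "-"), (1005, 1000, "+")]
    print(position_histogram(sites))
```

A sharp peak in such a histogram for a given motif is what "highly preferential positioning" would look like in practice.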
Subjects
Gene Expression Regulation, Nucleotide Motifs, Genetic Promoter Regions, Transcription Factors, Transcription Initiation Site, Genetic Transcription Initiation, Humans, Binding Sites, Gene Expression Regulation/genetics, Human Genome/genetics, Nucleotide Motifs/genetics, Genetic Promoter Regions/genetics, Protein Binding, Transcription Factors/metabolism, Genetic Variation
ABSTRACT
Transcribed enhancer maps can reveal nuclear interactions underpinning each cell type and connect specific cell types to diseases. Using a 5' single-cell RNA sequencing approach, we defined transcription start sites of enhancer RNAs and other classes of coding and noncoding RNAs in human CD4+ T cells, revealing cellular heterogeneity and differentiation trajectories. Integration of these datasets with single-cell chromatin profiles showed that active enhancers with bidirectional RNA transcription are highly cell type-specific and that disease heritability is strongly enriched in these enhancers. The resulting cell type-resolved multimodal atlas of bidirectionally transcribed enhancers, which we linked with promoters using fine-scale chromatin contact maps, enabled us to systematically interpret genetic variants associated with a range of immune-mediated diseases.
Subjects
CD4-Positive T-Lymphocytes, Genetic Enhancer Elements, Genetic Predisposition to Disease, Transcription Initiation Site, Genetic Transcription, Humans, CD4-Positive T-Lymphocytes/immunology, Cell Differentiation, Chromatin/metabolism, Chromatin/genetics, Genetic Promoter Regions, Helper-Inducer T-Lymphocytes/immunology, Single-Cell Gene Expression Analysis, Atlases as Topic
ABSTRACT
The three-dimensional (3D) organization of chromatin within the nucleus is crucial for gene regulation. However, the 3D architectural features that coordinate the activation of an entire chromosome remain largely unknown. We introduce an omics method, RNA-associated chromatin DNA-DNA interactions, that integrates RNA polymerase II (RNAPII)-mediated regulome with stochastic optical reconstruction microscopy to investigate the landscape of noncoding RNA roX2-associated chromatin topology for gene equalization to achieve dosage compensation. Our findings reveal that roX2 anchors to the target gene transcription end sites (TESs) and spreads in a distinctive boot-shaped configuration, promoting a more open chromatin state for hyperactivation. Furthermore, roX2 arches TES to transcription start sites to enhance transcriptional loops, potentially facilitating RNAPII convoying and connecting proximal promoter-promoter transcriptional hubs for synergistic gene regulation. These TESs cluster as roX2 compartments, surrounded by inactive domains for coactivation of multiple genes within the roX2 territory. In addition, roX2 structures gradually form and scaffold for stepwise coactivation in dosage compensation.
Subjects
Chromatin, RNA Polymerase II, X Chromosome, Chromatin/metabolism, Chromatin/genetics, X Chromosome/genetics, RNA Polymerase II/metabolism, RNA Polymerase II/genetics, Animals, Untranslated RNA/genetics, Gene Expression Regulation, Genetic Dosage Compensation, Genetic Promoter Regions, Transcription Initiation Site
ABSTRACT
Abnormal transcription initiation from alternative first exons has been reported to promote tumorigenesis. However, the prevalence and impact of gene expression regulation mediated by alternative tandem transcription initiation were mostly unknown in cancer. Here, we developed a robust computational method to analyze alternative tandem transcription start site (TSS) usage from standard RNA sequencing data. Applying this method to pan-cancer RNA sequencing datasets, we observed widespread dysregulation of tandem TSS usage in tumors, many instances of which were independent of changes in overall expression level or alternative first exon usage. We showed that the dynamics of tandem TSS usage was associated with epigenomic modulation. We found that significant 5' untranslated region shortening of the gene TIMM13 contributed to increased protein production, and up-regulation of TIMM13 by CRISPR-mediated transcriptional activation promoted proliferation and migration of lung cancer cells. Our findings suggest that dysregulated tandem TSS usage represents an additional layer of cancer-associated transcriptome alterations.