Results 1 - 20 of 3,812
1.
Epigenetics Chromatin ; 17(1): 17, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773468

ABSTRACT

BACKGROUND: Insulator-binding proteins (IBPs) play a critical role in genome architecture by forming and maintaining contact domains. While the involvement of several IBPs in organising chromatin architecture in Drosophila has been described, the specific contribution of the Suppressor of Hairy wings (Su(Hw)) insulator-binding protein to genome topology remains unclear. RESULTS: In this study, we provide evidence for the existence of long-range interactions between chromatin bound Su(Hw) and Combgap, which was first characterised as Polycomb response elements binding protein. Loss of Su(Hw) binding to chromatin results in the disappearance of Su(Hw)-Combgap long-range interactions and in a decrease in spatial self-interactions among a subset of Su(Hw)-bound genome sites. Our findings suggest that Su(Hw)-Combgap long-range interactions are associated with active chromatin rather than Polycomb-directed repression. Furthermore, we observe that the majority of transcription start sites that are down-regulated upon loss of Su(Hw) binding to chromatin are located within 2 kb of Combgap peaks and exhibit Su(Hw)-dependent changes in Combgap and transcriptional regulators' binding. CONCLUSIONS: This study demonstrates that Su(Hw) insulator binding protein can form long-range interactions with Combgap, Polycomb response elements binding protein, and that these interactions are associated with active chromatin factors rather than with Polycomb dependent repression.


Subject(s)
Chromatin , Drosophila Proteins , Animals , Drosophila Proteins/metabolism , Drosophila Proteins/genetics , Chromatin/metabolism , Drosophila melanogaster/metabolism , Repressor Proteins/metabolism , Repressor Proteins/genetics , Protein Binding , DNA-Binding Proteins/metabolism , Transcription Initiation Site , Polycomb-Group Proteins/metabolism , Drosophila/metabolism
2.
J Comput Biol ; 31(5): 445-457, 2024 May.
Article in English | MEDLINE | ID: mdl-38752891

ABSTRACT

An alternative transcription start site (ATSS) is a major driving force for increasing the complexity of transcripts in human tissues. As a transcriptional regulatory mechanism, ATSS has biological significance. Many studies have confirmed that ATSS plays an important role in diseases and cell development and differentiation. However, exploration of its dynamic mechanisms remains insufficient. Identifying ATSS change points during cell differentiation is critical for elucidating potential dynamic mechanisms. For relative ATSS usage as percentage data, the existing methods lack sensitivity to detect the change point for ATSS longitudinal data. In addition, some methods have strict requirements for data distribution and cannot be applied to deal with this problem. In this study, the Bayesian change point detection model was first constructed using reparameterization techniques for two parameters of a beta distribution for the percentage data type, and the posterior distributions of parameters and change points were obtained using Markov Chain Monte Carlo (MCMC) sampling. With comprehensive simulation studies, the performance of the Bayesian change point detection model is found to be consistently powerful and robust across most scenarios with different sample sizes and beta distributions. Second, differential ATSS events in the real data, whose change points were identified using our method, were clustered according to their change points. Last, for each change point, pathway and transcription factor motif analyses were performed on its differential ATSS events. The results of our analyses demonstrated the effectiveness of the Bayesian change point detection model and provided biological insights into cell differentiation.


Subject(s)
Bayes Theorem , Cell Differentiation , Transcription Initiation Site , Cell Differentiation/genetics , Humans , Markov Chains , Monte Carlo Method , Models, Genetic , Algorithms , Computer Simulation
3.
Nucleic Acids Res ; 52(8): 4393-4408, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38587182

ABSTRACT

Local mutation rates in human are highly heterogeneous, with known variability at the scale of megabase-sized chromosomal domains, and, on the other extreme, at the scale of oligonucleotides. The intermediate, kilobase-scale heterogeneity in mutation risk is less well characterized. Here, by analyzing thousands of somatic genomes, we studied mutation risk gradients along gene bodies, representing a genomic scale spanning roughly 1-10 kb, hypothesizing that different mutational mechanisms are differently distributed across gene segments. The main heterogeneity concerns several kilobases at the transcription start site and further downstream into 5' ends of gene bodies; these are commonly hypomutated with several mutational signatures, most prominently the ubiquitous C > T changes at CpG dinucleotides. The width and shape of this mutational coldspot at 5' gene ends is variable across genes, and corresponds to variable interval of lowered DNA methylation depending on gene activity level and regulation. Such hypomutated loci, at 5' gene ends or elsewhere, correspond to DNA hypomethylation that can associate with various landmarks, including intragenic enhancers, Polycomb-marked regions, or chromatin loop anchor points. Tissue-specific DNA hypomethylation begets tissue-specific local hypomutation. Of note, direction of mutation risk is inverted for AID/APOBEC3 cytosine deaminase activity, whose signatures are enriched in hypomethylated regions.


Subject(s)
CpG Islands , DNA Methylation , Mutation Rate , Humans , Mutation , Transcription Initiation Site , Genome, Human , Genetic Heterogeneity
4.
Nucleic Acids Res ; 52(9): 5179-5194, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38647081

ABSTRACT

Transcription factor RBPJ is the central component in Notch signal transduction and directly forms a coactivator complex together with the Notch intracellular domain (NICD). While RBPJ protein levels remain constant in most tissues, dynamic expression of Notch target genes varies depending on the given cell-type and the Notch activity state. To elucidate dynamic RBPJ binding genome-wide, we investigated RBPJ occupancy by ChIP-Seq. Surprisingly, only a small set of the total RBPJ sites show a dynamic binding behavior in response to Notch signaling. Compared to static RBPJ sites, dynamic sites differ in regard to their chromatin state, binding strength and enhancer positioning. Dynamic RBPJ sites are predominantly located distal to transcriptional start sites (TSSs), while most static sites are found in promoter-proximal regions. Importantly, gene responsiveness is preferentially associated with dynamic RBPJ binding sites and this static and dynamic binding behavior is repeatedly observed across different cell types and species. Based on the above findings we used a machine-learning algorithm to predict Notch responsiveness with high confidence in different cellular contexts. Our results strongly support the notion that the combination of binding strength and enhancer positioning are indicative of Notch responsiveness.


Subject(s)
Immunoglobulin J Recombination Signal Sequence-Binding Protein , Receptors, Notch , Immunoglobulin J Recombination Signal Sequence-Binding Protein/metabolism , Immunoglobulin J Recombination Signal Sequence-Binding Protein/genetics , Receptors, Notch/metabolism , Receptors, Notch/genetics , Binding Sites , Humans , Mice , Enhancer Elements, Genetic , Animals , Signal Transduction/genetics , Protein Binding , Promoter Regions, Genetic , Genomics/methods , Chromatin/metabolism , Chromatin/genetics , Transcription Initiation Site , Chromatin Immunoprecipitation Sequencing , Machine Learning , Gene Expression Regulation
5.
Science ; 384(6694): eadj0116, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38662817

ABSTRACT

Transcription initiation is a process that is essential to ensuring the proper function of any gene, yet we still lack a unified understanding of sequence patterns and rules that explain most transcription start sites in the human genome. By predicting transcription initiation at base-pair resolution from sequences with a deep learning-inspired explainable model called Puffin, we show that a small set of simple rules can explain transcription initiation at most human promoters. We identify key sequence patterns that contribute to human promoter activity, each activating transcription with distinct position-specific effects. Furthermore, we explain the sequence basis of bidirectional transcription at promoters, identify the links between promoter sequence and gene expression variation across cell types, and explore the conservation of sequence determinants of transcription initiation across mammalian species.


Subject(s)
Genome, Human , Promoter Regions, Genetic , Transcription Initiation Site , Transcription Initiation, Genetic , Humans , Deep Learning , Animals , Base Sequence
6.
Nat Commun ; 15(1): 3561, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38670996

ABSTRACT

Lysine lactylation (Kla) links metabolism and gene regulation and plays a key role in multiple biological processes. However, the regulatory mechanism and functional consequence of Kla remain to be explored. Here, we report that HBO1 functions as a lysine lactyltransferase to regulate transcription. We show that HBO1 catalyzes the addition of Kla in vitro and intracellularly, and E508 is a key site for the lactyltransferase activity of HBO1. Quantitative proteomic analysis further reveals 95 endogenous Kla sites targeted by HBO1, with the majority located on histones. Using site-specific antibodies, we find that HBO1 may preferentially catalyze histone H3K9la and that scaffold proteins including JADE1 and BRPF2 can promote the enzymatic activity for histone Kla. Notably, CUT&Tag assays demonstrate that HBO1 is required for histone H3K9la on transcription start sites (TSSs). Besides, the regulated Kla can promote key signaling pathways and tumorigenesis, which is further supported by evaluating the malignant behaviors of HBO1-knockout (KO) tumor cells, as well as the level of histone H3K9la in clinical tissues. Our study reveals that HBO1 serves as a lactyltransferase to mediate histone Kla-dependent gene transcription.


Subject(s)
Histones , Host Cell Factor C1 , Lysine , Transcription, Genetic , Histones/metabolism , Humans , Lysine/metabolism , HEK293 Cells , Animals , Cell Line, Tumor , Transcription Initiation Site , Gene Expression Regulation , Mice , Protein Processing, Post-Translational
7.
Science ; 384(6694): 382-383, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38662850

ABSTRACT

A deep-learning model reveals the rules that define transcription initiation.


Subject(s)
DNA , Transcription Initiation Site , Transcription Initiation, Genetic , Humans , Deep Learning , DNA/genetics , Promoter Regions, Genetic
8.
Genome Biol ; 25(1): 111, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685090

ABSTRACT

BACKGROUND: Untranslated regions (UTRs) are important mediators of post-transcriptional regulation. The length of UTRs and the composition of regulatory elements within them are known to vary substantially across genes, but little is known about the reasons for this variation in humans. Here, we set out to determine whether this variation, specifically in 5'UTRs, correlates with gene dosage sensitivity. RESULTS: We investigate 5'UTR length, the number of alternative transcription start sites, the potential for alternative splicing, the number and type of upstream open reading frames (uORFs) and the propensity of 5'UTRs to form secondary structures. We explore how these elements vary by gene tolerance to loss-of-function (LoF; using the LOEUF metric), and in genes where changes in dosage are known to cause disease. We show that LOEUF correlates with 5'UTR length and complexity. Genes that are most intolerant to LoF have longer 5'UTRs, greater TSS diversity, and more upstream regulatory elements than their LoF tolerant counterparts. We show that these differences are evident in disease gene-sets, but not in recessive developmental disorder genes where LoF of a single allele is tolerated. CONCLUSIONS: Our results confirm the importance of post-transcriptional regulation through 5'UTRs in tight regulation of mRNA and protein levels, particularly for genes where changes in dosage are deleterious and lead to disease. Finally, to support gene-based investigation we release a web-based browser tool, VuTR, that supports exploration of the composition of individual 5'UTRs and the impact of genetic variation within them.


Subject(s)
5' Untranslated Regions , Open Reading Frames , Protein Biosynthesis , Humans , Gene Dosage , Gene Expression Regulation , Transcription Initiation Site , Alternative Splicing , Nucleic Acid Conformation
9.
Nucleic Acids Res ; 52(9): 5016-5032, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38471819

ABSTRACT

Viruses are master remodelers of the host cell environment in support of infection and virus production. For example, viruses typically regulate cell gene expression through modulating canonical cell promoter activity. Here, we show that Epstein Barr virus (EBV) replication causes 'de novo' transcription initiation at 29674 new transcription start sites throughout the cell genome. De novo transcription initiation is facilitated in part by the unique properties of the viral pre-initiation complex (vPIC) that binds a TATT[T/A]AA, TATA box-like sequence and activates transcription with minimal support by additional transcription factors. Other de novo promoters are driven by the viral transcription factors, Zta and Rta and are influenced by directional proximity to existing canonical cell promoters, a configuration that fosters transcription through existing promoters and transcriptional interference. These studies reveal a new way that viruses interact with the host transcriptome to inhibit host gene expression and they shed light on primal features driving eukaryotic promoter function.


Subject(s)
Herpesvirus 4, Human , Promoter Regions, Genetic , Virus Replication , Humans , Virus Replication/genetics , Herpesvirus 4, Human/genetics , Herpesvirus 4, Human/physiology , Transcription Initiation Site , Transcription Initiation, Genetic , Host-Pathogen Interactions/genetics , Viral Proteins/metabolism , Viral Proteins/genetics , Transcription Factors/metabolism , Transcription Factors/genetics , Transcription, Genetic , TATA Box/genetics
10.
Int J Mol Sci ; 25(3), 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38338773

ABSTRACT

Since the discovery of physical peculiarities around transcription start sites (TSSs) and a site corresponding to the TATA box, research has revealed only the average features of these sites. Unsettled enigmas include the individual genes with these features and whether they relate to gene function. Herein, using 10 physical properties of DNA, including duplex DNA free energy, base stacking energy, protein-induced deformability, and stabilizing energy of Z-DNA, we clarified for the first time that approximately 97% of the promoters of 21,056 human protein-coding genes have distinctive physical properties around the TSS and/or position -27; of these, nearly 65% exhibited such properties at both sites. Furthermore, about 55% of the 21,056 genes had a minimum value of regional duplex DNA free energy within TSS-centered ±300 bp regions. Notably, distinctive physical properties within the promoters and free energies of the surrounding regions separated human protein-coding genes into five groups; each contained specific gene ontology (GO) terms. The group represented by immune response genes differed distinctly from the other four regarding the parameter of the free energies of the surrounding regions. A vital suggestion from this study is that physical-feature-based analyses of genomes may reveal new aspects of the organization and regulation of genes.


Subject(s)
DNA , Humans , Promoter Regions, Genetic , TATA Box/genetics , Transcription Initiation Site
11.
Biochim Biophys Acta Gene Regul Mech ; 1867(2): 195021, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38417480

ABSTRACT

The lysine 4 of histone H3 (H3K4) can be methylated or acetylated into four states: H3K4me1, H3K4me2, H3K4me3, or H3K4ac. Unlike H3K4 methylation, the genome-wide distribution and functional roles of H3K4ac remain unclear. To understand the relationship of acetylation with methylation at H3K4 and to explore the roles of H3K4ac in the context of chromatin, we analyzed H3K4ac across the human genome and compared it with H3K4 methylation in K562 cells. H3K4ac was positively correlated with H3K4me1/2/3 in reciprocal analysis. A decrease in H3K4ac through the mutation of the histone acetyltransferase p300 reduced H3K4me1 and H3K4me3 at the H3K4ac peaks. H3K4ac was also impaired by H3K4me depletion in the histone methyltransferase MLL3/4-mutated cells. H3K4ac peaks were enriched at enhancers in addition to the transcription start sites (TSSs) of genes. H3K4ac of TSSs and enhancers was positively correlated with mRNA and eRNA transcription. A decrease in H3K4ac reduced H3K4me3 and H3K4me1 in TSSs and enhancers, respectively, and inhibited the eviction of histone H3 from them. The mRNA transcription of highly transcribed genes was affected by the reduced H3K4ac. Interestingly, H3K4ac played a redundant role with regard to H3K27ac in eRNA transcription. These results indicate that H3K4ac serves as a marker of both active TSSs and enhancers and plays a role in histone eviction and RNA transcription by leading to H3K4me1/3.


Subject(s)
Enhancer Elements, Genetic , Histones , Transcription Initiation Site , Transcription, Genetic , Histones/metabolism , Humans , K562 Cells , Acetylation , Methylation , Chromatin/metabolism , RNA/metabolism , RNA/genetics
12.
Nature ; 627(8003): 424-430, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38418874

ABSTRACT

Mycobacterium tuberculosis (Mtb) is a bacterial pathogen that causes tuberculosis (TB), an infectious disease that is responsible for major health and economic costs worldwide. Mtb encounters diverse environments during its life cycle and responds to these changes largely by reprogramming its transcriptional output. However, the mechanisms of Mtb transcription and how they are regulated remain poorly understood. Here we use a sequencing method that simultaneously determines both termini of individual RNA molecules in bacterial cells to profile the Mtb transcriptome at high resolution. Unexpectedly, we find that most Mtb transcripts are incomplete, with their 5' ends aligned at transcription start sites and 3' ends located 200-500 nucleotides downstream. We show that these short RNAs are mainly associated with paused RNA polymerases (RNAPs) rather than being products of premature termination. We further show that the high propensity of Mtb RNAP to pause early in transcription relies on the binding of the σ-factor. Finally, we show that a translating ribosome promotes transcription elongation, revealing a potential role for transcription-translation coupling in controlling Mtb gene expression. In sum, our findings depict a mycobacterial transcriptome that prominently features incomplete transcripts resulting from RNAP pausing. We propose that the pausing phase constitutes an important transcriptional checkpoint in Mtb that allows the bacterium to adapt to environmental changes and could be exploited for TB therapeutics.


Subject(s)
Gene Expression Regulation, Bacterial , Mycobacterium tuberculosis , RNA, Bacterial , Transcriptome , DNA-Directed RNA Polymerases/metabolism , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/metabolism , RNA, Bacterial/analysis , RNA, Bacterial/biosynthesis , RNA, Bacterial/genetics , Transcriptome/genetics , Tuberculosis/microbiology , RNA, Messenger/analysis , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Transcription Initiation Site , Sigma Factor/metabolism , Ribosomes/metabolism , Protein Biosynthesis
13.
Bioinformatics ; 40(3), 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38407414

ABSTRACT

MOTIVATION: Prediction and identification of core promoter elements and transcription factor binding sites is essential for understanding the mechanism of transcription initiation and deciphering the biological activity of a specific locus. Thus, there is a need for an up-to-date tool to detect and curate core promoter elements/motifs in any provided nucleotide sequences. RESULTS: Here, we introduce ElemeNT 2023-a new and enhanced version of the Elements Navigation Tool, which provides novel capabilities for assessing evolutionary conservation and for readily evaluating the quality of high-throughput transcription start site (TSS) datasets, leveraging preferential motif positioning. ElemeNT 2023 is accessible both as a fast web-based tool and via command line (no coding skills are required to run the tool). While this tool is focused on core promoter elements, it can also be used for searching any user-defined motif, including sequence-specific DNA binding sites. Furthermore, ElemeNT's CORE database, which contains predicted core promoter elements around annotated TSSs, is now expanded to cover 10 species, ranging from worms to human. In this applications note, we describe the new workflow and demonstrate a case study using ElemeNT 2023 for core promoter composition analysis of diverse species, revealing motif prevalence and highlighting evolutionary insights. We discuss how this tool facilitates the exploration of uncharted transcriptomic data, appraises TSS quality, and aids in designing synthetic promoters for gene expression optimization. Taken together, ElemeNT 2023 empowers researchers with comprehensive tools for meticulous analysis of sequence elements and gene expression strategies. AVAILABILITY AND IMPLEMENTATION: ElemeNT 2023 is freely available at https://www.juven-gershonlab.org/resources/element-v2023/. The source code and command line version of ElemeNT 2023 are available at https://github.com/OritAdato/ElemeNT. 


Subject(s)
Software , Humans , Promoter Regions, Genetic , Protein Binding , Transcription Initiation Site
14.
G3 (Bethesda) ; 14(3), 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38253712

ABSTRACT

Transcriptional initiation is among the first regulated steps controlling eukaryotic gene expression. High-throughput profiling of fungal and animal genomes has revealed that RNA Polymerase II often initiates transcription in both directions at the promoter transcription start site, but generally only elongates productively into the gene body. Additionally, Pol II can initiate transcription in both directions at cis-regulatory elements such as enhancers. These bidirectional RNA Polymerase II initiation events can be observed directly with methods that capture nascent transcripts, and they are also revealed indirectly by the presence of transcription-associated histone modifications on both sides of the transcription start site or cis-regulatory elements. Previous studies have shown that nascent RNAs and transcription-associated histone modifications in the model plant Arabidopsis thaliana accumulate mainly in the gene body, suggesting that transcription does not initiate widely in the upstream direction from genes in this plant. We compared transcription-associated histone modifications and nascent transcripts at both transcription start sites and cis-regulatory elements in A. thaliana, Drosophila melanogaster, and Homo sapiens. Our results provide evidence for mostly unidirectional RNA Polymerase II initiation at both promoters and gene-proximal cis-regulatory elements of A. thaliana, whereas bidirectional transcription initiation is observed widely at promoters in both D. melanogaster and H. sapiens, as well as cis-regulatory elements in Drosophila. Furthermore, the distribution of transcription-associated histone modifications around transcription start sites in the Oryza sativa (rice) and Glycine max (soybean) genomes suggests that unidirectional transcription initiation is the norm in these genomes as well. 
These results suggest that there are fundamental differences in transcriptional initiation directionality between flowering plant and metazoan genomes, which are manifested as distinct patterns of chromatin modifications around RNA polymerase initiation sites.


Subject(s)
Arabidopsis , Chromatin , Animals , Chromatin/genetics , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , Transcription, Genetic , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Transcription Initiation Site , Plants/genetics
15.
Nat Struct Mol Biol ; 31(1): 190-202, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38177677

ABSTRACT

Transcription start site (TSS) selection is a key step in gene expression and occurs at many promoter positions over a wide range of efficiencies. Here we develop a massively parallel reporter assay to quantitatively dissect contributions of promoter sequence, nucleoside triphosphate substrate levels and RNA polymerase II (Pol II) activity to TSS selection by 'promoter scanning' in Saccharomyces cerevisiae (Pol II MAssively Systematic Transcript End Readout, 'Pol II MASTER'). Using Pol II MASTER, we measure the efficiency of Pol II initiation at 1,000,000 individual TSS sequences in a defined promoter context. Pol II MASTER confirms proposed critical qualities of S. cerevisiae TSS -8, -1 and +1 positions, quantitatively, in a controlled promoter context. Pol II MASTER extends quantitative analysis to surrounding sequences and determines that they tune initiation over a wide range of efficiencies. These results enabled the development of a predictive model for initiation efficiency based on sequence. We show that genetic perturbation of Pol II catalytic activity alters initiation efficiency mostly independently of TSS sequence, but selectively modulates preference for the initiating nucleotide. Intriguingly, we find that Pol II initiation efficiency is directly sensitive to guanosine-5'-triphosphate levels at the first five transcript positions and to cytosine-5'-triphosphate and uridine-5'-triphosphate levels at the second position genome wide. These results suggest individual nucleoside triphosphate levels can have transcript-specific effects on initiation, representing a cryptic layer of potential regulation at the level of Pol II biochemical properties. The results establish Pol II MASTER as a method for quantitative dissection of transcription initiation in eukaryotes.


Subject(s)
Polyphosphates , RNA Polymerase II , Saccharomyces cerevisiae , RNA Polymerase II/metabolism , Saccharomyces cerevisiae/metabolism , Base Sequence , Transcription Initiation Site , Nucleosides , Transcription, Genetic , Guanosine Triphosphate
16.
Nucleic Acids Res ; 52(D1): D322-D333, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37956335

ABSTRACT

Transposable elements (TEs) are abundant in the genome and serve as crucial regulatory elements. Some TEs function as epigenetically regulated promoters, and these TE-derived transcription start sites (TSSs) play a crucial role in regulating genes associated with specific functions, such as cancer and embryogenesis. However, the lack of an accessible database that systematically gathers TE-derived TSS data is a current research gap. To address this, we established TE-TSS, an integrated data resource of human and mouse TE-derived TSSs (http://xozhanglab.com/TETSS). TE-TSS has compiled 2681 RNA sequencing datasets, spanning various tissues, cell lines and developmental stages. From these, we identified 5768 human TE-derived TSSs and 2797 mouse TE-derived TSSs, with 47% and 38% being experimentally validated, respectively. TE-TSS enables comprehensive exploration of TSS usage in diverse samples, providing insights into tissue-specific gene expression patterns and transcriptional regulatory elements. Furthermore, TE-TSS compares TE-derived TSS regions across 15 mammalian species, enhancing our understanding of their evolutionary and functional aspects. The establishment of TE-TSS facilitates further investigations into the roles of TEs in shaping the transcriptomic landscape and offers valuable resources for comprehending their involvement in diverse biological processes.


Subject(s)
DNA Transposable Elements , Databases, Genetic , Regulatory Sequences, Nucleic Acid , Transcription Initiation Site , Animals , Humans , Mice , DNA Transposable Elements/genetics , Mammals/genetics , Promoter Regions, Genetic , Sequence Analysis, RNA , Internet
17.
Nucleic Acids Res ; 52(2): e7, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-37994784

ABSTRACT

Precise detection of the transcriptional start site (TSS) is key to characterizing transcriptional regulation of genes and to annotating newly sequenced genomes. Here, we describe the development of an improved method, designated 'TSS-seq2.' This method is an iterative improvement of TSS-seq, a previously published enzymatic cap-structure conversion method to detect TSSs in base sequences. Modifying the original procedure, including introducing split ligation at the key cap-selection step, has substantially improved the yield and the accuracy of the reaction. For example, TSS-seq2 can be conducted using as little as 5 ng of total RNA with an overall accuracy of 96%; this yields a less biased and more precise detection of TSS. We then applied TSS-seq2 for TSS analysis of four plant species that had not yet been analyzed by any previous TSS method.


Subject(s)
Sequence Analysis, RNA , Transcription Initiation Site , Base Sequence , Gene Expression Regulation , Promoter Regions, Genetic , Sequence Analysis, RNA/methods
19.
Nat Commun ; 14(1): 7240, 2023 Nov 09.
Article in English | MEDLINE | ID: mdl-37945584

ABSTRACT

Five-prime single-cell RNA-seq (scRNA-seq) has been widely employed to profile cellular transcriptomes; however, its power for analysing transcription start sites (TSS) has not been fully utilised. Here, we present a computational method suite, CamoTSS, to precisely identify TSS and quantify its expression by leveraging the cDNA on read 1, which enables effective detection of alternative TSS usage. With various experimental data sets, we have demonstrated that CamoTSS can accurately identify TSS and that the detected alternative TSS usages showed strong specificity in different biological processes, including cell types across human organs, the development of human thymus, and cancer conditions. As evidenced in nasopharyngeal cancer, alternative TSS usage can also reveal regulatory patterns including systematic TSS dysregulations.


Subject(s)
Nasopharyngeal Neoplasms , Humans , Transcription Initiation Site , Single-Cell Gene Expression Analysis , Transcriptome/genetics , Phenotype , Single-Cell Analysis/methods
20.
Nat Struct Mol Biol ; 30(12): 1970-1984, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37996663

ABSTRACT

Global changes in transcriptional regulation and RNA metabolism are crucial features of cancer development. However, little is known about the role of the core promoter in defining transcript identity and post-transcriptional fates, a potentially crucial layer of transcriptional regulation in cancer. In this study, we use CAGE-seq analysis to uncover widespread use of dual-initiation promoters in which non-canonical, first-base-cytosine (C) transcription initiation occurs alongside first-base-purine initiation across 59 human cancers and healthy tissues. C-initiation is often followed by a 5' terminal oligopyrimidine (5'TOP) sequence, dramatically increasing the range of genes potentially subjected to 5'TOP-associated post-transcriptional regulation. We show selective, dynamic switching between purine and C-initiation site usage, indicating transcription initiation-level regulation in cancers. We additionally detail global metabolic changes in C-initiation transcripts that mark differentiation status, proliferative capacity, radiosensitivity, and response to irradiation and to PI3K-Akt-mTOR and DNA damage pathway-targeted radiosensitization therapies in colorectal cancer organoids and cancer cell lines and tissues.


Subject(s)
Phosphatidylinositol 3-Kinases , RNA , Humans , Transcription Initiation Site , RNA/genetics , Cell Proliferation , Purines