Results 1 - 20 of 3,781
1.
Nucleic Acids Res ; 52(8): 4393-4408, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38587182

ABSTRACT

Local mutation rates in humans are highly heterogeneous, with known variability at the scale of megabase-sized chromosomal domains, and, on the other extreme, at the scale of oligonucleotides. The intermediate, kilobase-scale heterogeneity in mutation risk is less well characterized. Here, by analyzing thousands of somatic genomes, we studied mutation risk gradients along gene bodies, representing a genomic scale spanning roughly 1-10 kb, hypothesizing that different mutational mechanisms are differently distributed across gene segments. The main heterogeneity concerns several kilobases at the transcription start site and further downstream into 5' ends of gene bodies; these are commonly hypomutated with several mutational signatures, most prominently the ubiquitous C > T changes at CpG dinucleotides. The width and shape of this mutational coldspot at 5' gene ends is variable across genes, and corresponds to a variable interval of lowered DNA methylation depending on gene activity level and regulation. Such hypomutated loci, at 5' gene ends or elsewhere, correspond to DNA hypomethylation that can associate with various landmarks, including intragenic enhancers, Polycomb-marked regions, or chromatin loop anchor points. Tissue-specific DNA hypomethylation begets tissue-specific local hypomutation. Of note, the direction of mutation risk is inverted for AID/APOBEC3 cytosine deaminase activity, whose signatures are enriched in hypomethylated regions.


Subjects
CpG Islands, DNA Methylation, Mutation Rate, Humans, Mutation, Transcription Initiation Site, Human Genome, Genetic Heterogeneity
2.
Nat Commun ; 15(1): 3561, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38670996

ABSTRACT

Lysine lactylation (Kla) links metabolism and gene regulation and plays a key role in multiple biological processes. However, the regulatory mechanism and functional consequence of Kla remain to be explored. Here, we report that HBO1 functions as a lysine lactyltransferase to regulate transcription. We show that HBO1 catalyzes the addition of Kla in vitro and intracellularly, and E508 is a key site for the lactyltransferase activity of HBO1. Quantitative proteomic analysis further reveals 95 endogenous Kla sites targeted by HBO1, with the majority located on histones. Using site-specific antibodies, we find that HBO1 may preferentially catalyze histone H3K9la and scaffold proteins including JADE1 and BRPF2 can promote the enzymatic activity for histone Kla. Notably, CUT&Tag assays demonstrate that HBO1 is required for histone H3K9la on transcription start sites (TSSs). Besides, the regulated Kla can promote key signaling pathways and tumorigenesis, which is further supported by evaluating the malignant behaviors of HBO1-knockout (KO) tumor cells, as well as the level of histone H3K9la in clinical tissues. Our study reveals that HBO1 serves as a lactyltransferase to mediate histone Kla-dependent gene transcription.


Subjects
Histones, Host Cell Factor C1, Lysine, Genetic Transcription, Histones/metabolism, Humans, Lysine/metabolism, HEK293 Cells, Animals, Tumor Cell Line, Transcription Initiation Site, Gene Expression Regulation, Mice, Post-Translational Protein Processing
3.
Science ; 384(6694): eadj0116, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38662817

ABSTRACT

Transcription initiation is a process that is essential to ensuring the proper function of any gene, yet we still lack a unified understanding of sequence patterns and rules that explain most transcription start sites in the human genome. By predicting transcription initiation at base-pair resolution from sequences with a deep learning-inspired explainable model called Puffin, we show that a small set of simple rules can explain transcription initiation at most human promoters. We identify key sequence patterns that contribute to human promoter activity, each activating transcription with distinct position-specific effects. Furthermore, we explain the sequence basis of bidirectional transcription at promoters, identify the links between promoter sequence and gene expression variation across cell types, and explore the conservation of sequence determinants of transcription initiation across mammalian species.


Subjects
Human Genome, Genetic Promoter Regions, Transcription Initiation Site, Genetic Transcription Initiation, Humans, Deep Learning, Animals, Base Sequence
4.
Science ; 384(6694): 382-383, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38662850

ABSTRACT

A deep-learning model reveals the rules that define transcription initiation.


Subjects
DNA, Transcription Initiation Site, Genetic Transcription Initiation, Humans, Deep Learning, DNA/genetics, Genetic Promoter Regions
5.
Genome Biol ; 25(1): 111, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685090

ABSTRACT

BACKGROUND: Untranslated regions (UTRs) are important mediators of post-transcriptional regulation. The length of UTRs and the composition of regulatory elements within them are known to vary substantially across genes, but little is known about the reasons for this variation in humans. Here, we set out to determine whether this variation, specifically in 5'UTRs, correlates with gene dosage sensitivity. RESULTS: We investigate 5'UTR length, the number of alternative transcription start sites, the potential for alternative splicing, the number and type of upstream open reading frames (uORFs) and the propensity of 5'UTRs to form secondary structures. We explore how these elements vary by gene tolerance to loss-of-function (LoF; using the LOEUF metric), and in genes where changes in dosage are known to cause disease. We show that LOEUF correlates with 5'UTR length and complexity. Genes that are most intolerant to LoF have longer 5'UTRs, greater TSS diversity, and more upstream regulatory elements than their LoF tolerant counterparts. We show that these differences are evident in disease gene-sets, but not in recessive developmental disorder genes where LoF of a single allele is tolerated. CONCLUSIONS: Our results confirm the importance of post-transcriptional regulation through 5'UTRs in tight regulation of mRNA and protein levels, particularly for genes where changes in dosage are deleterious and lead to disease. Finally, to support gene-based investigation we release a web-based browser tool, VuTR, that supports exploration of the composition of individual 5'UTRs and the impact of genetic variation within them.


Subjects
5' Untranslated Regions, Open Reading Frames, Protein Biosynthesis, Humans, Gene Dosage, Gene Expression Regulation, Transcription Initiation Site, Alternative Splicing, Nucleic Acid Conformation
6.
Nature ; 627(8003): 424-430, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38418874

ABSTRACT

Mycobacterium tuberculosis (Mtb) is a bacterial pathogen that causes tuberculosis (TB), an infectious disease that is responsible for major health and economic costs worldwide [1]. Mtb encounters diverse environments during its life cycle and responds to these changes largely by reprogramming its transcriptional output [2]. However, the mechanisms of Mtb transcription and how they are regulated remain poorly understood. Here we use a sequencing method that simultaneously determines both termini of individual RNA molecules in bacterial cells [3] to profile the Mtb transcriptome at high resolution. Unexpectedly, we find that most Mtb transcripts are incomplete, with their 5' ends aligned at transcription start sites and 3' ends located 200-500 nucleotides downstream. We show that these short RNAs are mainly associated with paused RNA polymerases (RNAPs) rather than being products of premature termination. We further show that the high propensity of Mtb RNAP to pause early in transcription relies on the binding of the σ-factor. Finally, we show that a translating ribosome promotes transcription elongation, revealing a potential role for transcription-translation coupling in controlling Mtb gene expression. In sum, our findings depict a mycobacterial transcriptome that prominently features incomplete transcripts resulting from RNAP pausing. We propose that the pausing phase constitutes an important transcriptional checkpoint in Mtb that allows the bacterium to adapt to environmental changes and could be exploited for TB therapeutics.


Subjects
Bacterial Gene Expression Regulation, Mycobacterium tuberculosis, Bacterial RNA, Transcriptome, DNA-Directed RNA Polymerases/metabolism, Mycobacterium tuberculosis/genetics, Mycobacterium tuberculosis/metabolism, Bacterial RNA/analysis, Bacterial RNA/biosynthesis, Bacterial RNA/genetics, Transcriptome/genetics, Tuberculosis/microbiology, Messenger RNA/analysis, Messenger RNA/biosynthesis, Messenger RNA/genetics, Transcription Initiation Site, Sigma Factor/metabolism, Ribosomes/metabolism, Protein Biosynthesis
7.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38407414

ABSTRACT

MOTIVATION: Prediction and identification of core promoter elements and transcription factor binding sites is essential for understanding the mechanism of transcription initiation and deciphering the biological activity of a specific locus. Thus, there is a need for an up-to-date tool to detect and curate core promoter elements/motifs in any provided nucleotide sequences. RESULTS: Here, we introduce ElemeNT 2023, a new and enhanced version of the Elements Navigation Tool, which provides novel capabilities for assessing evolutionary conservation and for readily evaluating the quality of high-throughput transcription start site (TSS) datasets, leveraging preferential motif positioning. ElemeNT 2023 is accessible both as a fast web-based tool and via command line (no coding skills are required to run the tool). While this tool is focused on core promoter elements, it can also be used for searching any user-defined motif, including sequence-specific DNA binding sites. Furthermore, ElemeNT's CORE database, which contains predicted core promoter elements around annotated TSSs, is now expanded to cover 10 species, ranging from worms to human. In this applications note, we describe the new workflow and demonstrate a case study using ElemeNT 2023 for core promoter composition analysis of diverse species, revealing motif prevalence and highlighting evolutionary insights. We discuss how this tool facilitates the exploration of uncharted transcriptomic data, appraises TSS quality, and aids in designing synthetic promoters for gene expression optimization. Taken together, ElemeNT 2023 empowers researchers with comprehensive tools for meticulous analysis of sequence elements and gene expression strategies. AVAILABILITY AND IMPLEMENTATION: ElemeNT 2023 is freely available at https://www.juven-gershonlab.org/resources/element-v2023/. The source code and command line version of ElemeNT 2023 are available at https://github.com/OritAdato/ElemeNT. No coding skills are required to run the tool.
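
As a rough illustration of what "preferential motif positioning" can mean in practice, the sketch below looks for a TATA-like consensus in a window upstream of an annotated TSS and reports the fraction of TSSs that carry it. This is not ElemeNT's algorithm or interface: the consensus `TATAWAAR`, the -35 to -20 window, and the function names `scan_upstream` and `fraction_with_motif` are all illustrative assumptions.

```python
# Hypothetical sketch, not ElemeNT's implementation.
# Minimal IUPAC table covering only the codes used in this example.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "R": "AG", "Y": "CT", "N": "ACGT",
}


def matches(consensus, kmer):
    """True if every base of kmer is allowed by the IUPAC consensus."""
    return len(consensus) == len(kmer) and all(
        base in IUPAC[code] for code, base in zip(consensus, kmer)
    )


def scan_upstream(seq, tss_index, consensus="TATAWAAR", window=(-35, -20)):
    """Return motif start positions relative to the TSS.

    tss_index is the 0-based index of the +1 base, so relative
    position -1 is the base immediately upstream of it.
    """
    hits = []
    k = len(consensus)
    for rel in range(window[0], window[1] + 1):
        start = tss_index + rel
        if start < 0 or start + k > len(seq):
            continue  # window runs off the sequence
        if matches(consensus, seq[start:start + k].upper()):
            hits.append(rel)
    return hits


def fraction_with_motif(records, **kwargs):
    """Fraction of (sequence, tss_index) records with at least one hit."""
    if not records:
        return 0.0
    with_hit = sum(1 for seq, tss in records if scan_upstream(seq, tss, **kwargs))
    return with_hit / len(records)
```

Under these assumptions, a TSS dataset in which few positions have a correctly spaced upstream element would give a low fraction, which is one simple way a positional motif check could flag questionable TSS calls.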


Subjects
Software, Humans, Genetic Promoter Regions, Protein Binding, Transcription Initiation Site
8.
Int J Mol Sci ; 25(3)2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38338773

ABSTRACT

Since the discovery of physical peculiarities around transcription start sites (TSSs) and a site corresponding to the TATA box, research has revealed only the average features of these sites. Unsettled enigmas include the individual genes with these features and whether they relate to gene function. Herein, using 10 physical properties of DNA, including duplex DNA free energy, base stacking energy, protein-induced deformability, and stabilizing energy of Z-DNA, we clarified for the first time that approximately 97% of the promoters of 21,056 human protein-coding genes have distinctive physical properties around the TSS and/or position -27; of these, nearly 65% exhibited such properties at both sites. Furthermore, about 55% of the 21,056 genes had a minimum value of regional duplex DNA free energy within TSS-centered ±300 bp regions. Notably, distinctive physical properties within the promoters and free energies of the surrounding regions separated human protein-coding genes into five groups; each contained specific gene ontology (GO) terms. The group represented by immune response genes differed distinctly from the other four regarding the parameter of the free energies of the surrounding regions. A vital suggestion from this study is that physical-feature-based analyses of genomes may reveal new aspects of the organization and regulation of genes.


Subjects
DNA, Humans, Genetic Promoter Regions, TATA Box/genetics, Transcription Initiation Site
9.
G3 (Bethesda) ; 14(3)2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38253712

ABSTRACT

Transcriptional initiation is among the first regulated steps controlling eukaryotic gene expression. High-throughput profiling of fungal and animal genomes has revealed that RNA Polymerase II often initiates transcription in both directions at the promoter transcription start site, but generally only elongates productively into the gene body. Additionally, Pol II can initiate transcription in both directions at cis-regulatory elements such as enhancers. These bidirectional RNA Polymerase II initiation events can be observed directly with methods that capture nascent transcripts, and they are also revealed indirectly by the presence of transcription-associated histone modifications on both sides of the transcription start site or cis-regulatory elements. Previous studies have shown that nascent RNAs and transcription-associated histone modifications in the model plant Arabidopsis thaliana accumulate mainly in the gene body, suggesting that transcription does not initiate widely in the upstream direction from genes in this plant. We compared transcription-associated histone modifications and nascent transcripts at both transcription start sites and cis-regulatory elements in A. thaliana, Drosophila melanogaster, and Homo sapiens. Our results provide evidence for mostly unidirectional RNA Polymerase II initiation at both promoters and gene-proximal cis-regulatory elements of A. thaliana, whereas bidirectional transcription initiation is observed widely at promoters in both D. melanogaster and H. sapiens, as well as cis-regulatory elements in Drosophila. Furthermore, the distribution of transcription-associated histone modifications around transcription start sites in the Oryza sativa (rice) and Glycine max (soybean) genomes suggests that unidirectional transcription initiation is the norm in these genomes as well. These results suggest that there are fundamental differences in transcriptional initiation directionality between flowering plant and metazoan genomes, which are manifested as distinct patterns of chromatin modifications around RNA polymerase initiation sites.


Subjects
Arabidopsis, Chromatin, Animals, Chromatin/genetics, RNA Polymerase II/genetics, RNA Polymerase II/metabolism, Genetic Transcription, Drosophila melanogaster/genetics, Drosophila melanogaster/metabolism, Arabidopsis/genetics, Arabidopsis/metabolism, Transcription Initiation Site, Plants/genetics
10.
Nat Struct Mol Biol ; 31(1): 190-202, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38177677

ABSTRACT

Transcription start site (TSS) selection is a key step in gene expression and occurs at many promoter positions over a wide range of efficiencies. Here we develop a massively parallel reporter assay to quantitatively dissect contributions of promoter sequence, nucleoside triphosphate substrate levels and RNA polymerase II (Pol II) activity to TSS selection by 'promoter scanning' in Saccharomyces cerevisiae (Pol II MAssively Systematic Transcript End Readout, 'Pol II MASTER'). Using Pol II MASTER, we measure the efficiency of Pol II initiation at 1,000,000 individual TSS sequences in a defined promoter context. Pol II MASTER confirms proposed critical qualities of S. cerevisiae TSS -8, -1 and +1 positions, quantitatively, in a controlled promoter context. Pol II MASTER extends quantitative analysis to surrounding sequences and determines that they tune initiation over a wide range of efficiencies. These results enabled the development of a predictive model for initiation efficiency based on sequence. We show that genetic perturbation of Pol II catalytic activity alters initiation efficiency mostly independently of TSS sequence, but selectively modulates preference for the initiating nucleotide. Intriguingly, we find that Pol II initiation efficiency is directly sensitive to guanosine-5'-triphosphate levels at the first five transcript positions and to cytosine-5'-triphosphate and uridine-5'-triphosphate levels at the second position genome wide. These results suggest individual nucleoside triphosphate levels can have transcript-specific effects on initiation, representing a cryptic layer of potential regulation at the level of Pol II biochemical properties. The results establish Pol II MASTER as a method for quantitative dissection of transcription initiation in eukaryotes.


Subjects
Polyphosphates, RNA Polymerase II, Saccharomyces cerevisiae, RNA Polymerase II/metabolism, Saccharomyces cerevisiae/metabolism, Base Sequence, Transcription Initiation Site, Nucleosides, Genetic Transcription, Guanosine Triphosphate
11.
Nucleic Acids Res ; 52(D1): D322-D333, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37956335

ABSTRACT

Transposable elements (TEs) are abundant in the genome and serve as crucial regulatory elements. Some TEs function as epigenetically regulated promoters, and these TE-derived transcription start sites (TSSs) play a crucial role in regulating genes associated with specific functions, such as cancer and embryogenesis. However, the lack of an accessible database that systematically gathers TE-derived TSS data is a current research gap. To address this, we established TE-TSS, an integrated data resource of human and mouse TE-derived TSSs (http://xozhanglab.com/TETSS). TE-TSS has compiled 2681 RNA sequencing datasets, spanning various tissues, cell lines and developmental stages. From these, we identified 5768 human TE-derived TSSs and 2797 mouse TE-derived TSSs, with 47% and 38% being experimentally validated, respectively. TE-TSS enables comprehensive exploration of TSS usage in diverse samples, providing insights into tissue-specific gene expression patterns and transcriptional regulatory elements. Furthermore, TE-TSS compares TE-derived TSS regions across 15 mammalian species, enhancing our understanding of their evolutionary and functional aspects. The establishment of TE-TSS facilitates further investigations into the roles of TEs in shaping the transcriptomic landscape and offers valuable resources for comprehending their involvement in diverse biological processes.


Subjects
DNA Transposable Elements, Genetic Databases, Nucleic Acid Regulatory Sequences, Transcription Initiation Site, Animals, Humans, Mice, DNA Transposable Elements/genetics, Mammals/genetics, Genetic Promoter Regions, RNA Sequence Analysis, Internet
12.
Nucleic Acids Res ; 52(2): e7, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-37994784

ABSTRACT

Precise detection of the transcriptional start site (TSS) is key for characterizing transcriptional regulation of genes and for annotation of newly sequenced genomes. Here, we describe the development of an improved method, designated 'TSS-seq2.' This method is an iterative improvement of TSS-seq, a previously published enzymatic cap-structure conversion method to detect TSSs in base sequences. By modifying the original procedure, including by introducing split ligation at the key cap-selection step, the yield and the accuracy of the reaction have been substantially improved. For example, TSS-seq2 can be conducted using as little as 5 ng of total RNA with an overall accuracy of 96%; this yields a less-biased and more precise detection of TSSs. We then applied TSS-seq2 for TSS analysis of four plant species that had not yet been analyzed by any previous TSS method.


Subjects
RNA Sequence Analysis, Transcription Initiation Site, Base Sequence, Gene Expression Regulation, Genetic Promoter Regions, RNA Sequence Analysis/methods
14.
Nat Commun ; 14(1): 7240, 2023 Nov 09.
Article in English | MEDLINE | ID: mdl-37945584

ABSTRACT

Five-prime single-cell RNA-seq (scRNA-seq) has been widely employed to profile cellular transcriptomes; however, its power to analyse transcription start sites (TSSs) has not been fully utilised. Here, we present a computational method suite, CamoTSS, to precisely identify TSSs and quantify their expression by leveraging the cDNA on read 1, which enables effective detection of alternative TSS usage. With various experimental data sets, we have demonstrated that CamoTSS can accurately identify TSSs and that the detected alternative TSS usages showed strong specificity in different biological processes, including cell types across human organs, the development of human thymus, and cancer conditions. As evidenced in nasopharyngeal cancer, alternative TSS usage can also reveal regulatory patterns including systematic TSS dysregulations.
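
The general idea of calling TSSs from the 5' ends of reads can be sketched as below: nearby 5' end coordinates are grouped into clusters, weakly supported clusters are dropped, and the most frequent position in each cluster is reported as its summit. This is a generic illustration only and is not CamoTSS's method; the function name `call_tss_clusters` and the `max_gap` and `min_reads` thresholds are assumptions chosen for the example.

```python
from collections import Counter


def call_tss_clusters(five_prime_ends, max_gap=10, min_reads=3):
    """Group read 5' end coordinates (one strand, one chromosome) into clusters.

    Positions closer than or equal to max_gap are merged into one cluster.
    Clusters supported by fewer than min_reads reads are discarded.
    Both thresholds are illustrative assumptions.
    """
    counts = Counter(five_prime_ends)
    clusters, current = [], []
    for pos in sorted(counts):
        if current and pos - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)

    called = []
    for positions in clusters:
        total = sum(counts[p] for p in positions)
        if total < min_reads:
            continue
        # Summit = most frequent position; ties go to the smaller coordinate.
        summit = max(positions, key=lambda p: (counts[p], -p))
        called.append({
            "start": positions[0],
            "end": positions[-1],
            "summit": summit,
            "reads": total,
        })
    return called
```

With per-cell-type read sets, comparing the `reads` totals of clusters that belong to the same gene would give a crude view of alternative TSS usage.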


Subjects
Nasopharyngeal Neoplasms, Humans, Transcription Initiation Site, Single-Cell Gene Expression Analysis, Transcriptome/genetics, Phenotype, Single-Cell Analysis/methods
15.
Nat Struct Mol Biol ; 30(12): 1970-1984, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37996663

ABSTRACT

Global changes in transcriptional regulation and RNA metabolism are crucial features of cancer development. However, little is known about the role of the core promoter in defining transcript identity and post-transcriptional fates, a potentially crucial layer of transcriptional regulation in cancer. In this study, we use CAGE-seq analysis to uncover widespread use of dual-initiation promoters in which non-canonical, first-base-cytosine (C) transcription initiation occurs alongside first-base-purine initiation across 59 human cancers and healthy tissues. C-initiation is often followed by a 5' terminal oligopyrimidine (5'TOP) sequence, dramatically increasing the range of genes potentially subjected to 5'TOP-associated post-transcriptional regulation. We show selective, dynamic switching between purine and C-initiation site usage, indicating transcription initiation-level regulation in cancers. We additionally detail global metabolic changes in C-initiation transcripts that mark differentiation status, proliferative capacity, radiosensitivity, and response to irradiation and to PI3K-Akt-mTOR and DNA damage pathway-targeted radiosensitization therapies in colorectal cancer organoids and cancer cell lines and tissues.


Subjects
Phosphatidylinositol 3-Kinases, RNA, Humans, Transcription Initiation Site, RNA/genetics, Cell Proliferation, Purines
16.
PLoS Comput Biol ; 19(11): e1011491, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37983292

ABSTRACT

Core promoters are stretches of DNA at the beginning of genes that contain information that facilitates the binding of transcription initiation complexes. Different functional subsets of genes have core promoters with distinct architectures and characteristic motifs. Some of these motifs inform the selection of transcription start sites (TSS). By discovering motifs with fixed distances from known TSS positions, we could in principle classify promoters into different functional groups. Due to the variability and overlap of architectures, promoter classification is a difficult task that requires new approaches. In this study, we present a new method based on non-negative matrix factorisation (NMF) and the associated software called seqArchR that clusters promoter sequences based on their motifs at near-fixed distances from a reference point, such as TSS. When combined with experimental data from CAGE, seqArchR can efficiently identify TSS-directing motifs, including known ones like TATA, DPE, and nucleosome positioning signal, as well as novel lineage-specific motifs and the function of genes associated with them. By using seqArchR on developmental time courses, we reveal how relative use of promoter architectures changes over time with stage-specific expression. seqArchR is a powerful tool for initial genome-wide classification and functional characterisation of promoters. Its use cases are more general: it can also be used to discover any motifs at near-fixed distances from a reference point, even if they are present in only a small subset of sequences.
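
To make the NMF idea concrete, the sketch below one-hot encodes equal-length sequences aligned on a reference point and factorises the resulting non-negative matrix with the classic multiplicative-update rule. It is a toy illustration and not seqArchR's implementation, which this abstract does not detail; the function names, the pure-Python matrix helpers, and the choice of update rule are all assumptions.

```python
import random


def one_hot(seq):
    """Flatten a DNA sequence into 4 values (A, C, G, T) per position."""
    vec = []
    for base in seq.upper():
        vec.extend(1.0 if base == b else 0.0 for b in "ACGT")
    return vec


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factorise V (sequences x features) into W (sequences x k) and H (k x features)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(m)]
    return w, h


def assign_clusters(w):
    """Assign each sequence to the component with the largest loading."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]
```

In this layout each row of `H` can be read as a position-specific base pattern, and `assign_clusters(W)` gives one candidate grouping of promoters by the pattern they load on most strongly.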


Subjects
Algorithms, Software, Genetic Promoter Regions/genetics, Transcription Initiation Site, Nucleosomes
17.
Nature ; 622(7981): 173-179, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37731000

ABSTRACT

Lysine residues in histones and other proteins can be modified by post-translational modifications that encode regulatory information [1]. Lysine acetylation and methylation are especially important for regulating chromatin and gene expression [2-4]. Pathways involving these post-translational modifications are targets for clinically approved therapeutics to treat human diseases. Lysine methylation and acetylation are generally assumed to be mutually exclusive at the same residue. Here we report cellular lysine residues that are both methylated and acetylated on the same side chain to form Nε-acetyl-Nε-methyllysine (Kacme). We show that Kacme is found on histone H4 (H4Kacme) across a range of species and across mammalian tissues. Kacme is associated with marks of active chromatin, increased transcriptional initiation and is regulated in response to biological signals. H4Kacme can be installed by enzymatic acetylation of monomethyllysine peptides and is resistant to deacetylation by some HDACs in vitro. Kacme can be bound by chromatin proteins that recognize modified lysine residues, as we demonstrate with the crystal structure of acetyllysine-binding protein BRD2 bound to a histone H4Kacme peptide. These results establish Kacme as a cellular post-translational modification with the potential to encode information distinct from methylation and acetylation alone and demonstrate that Kacme has all the hallmarks of a post-translational modification with fundamental importance to chromatin biology.


Subjects
Acetylation, Chromatin, Lysine, Methylation, Post-Translational Protein Processing, Transcription Initiation Site, Animals, Humans, Chromatin/chemistry, Chromatin/genetics, Chromatin/metabolism, Histones/chemistry, Histones/metabolism, Lysine/analogs & derivatives, Lysine/chemistry, Lysine/metabolism, Peptides/chemistry, Peptides/metabolism, Histone Deacetylases/metabolism
18.
J Virol ; 97(9): e0081823, 2023 Sep 28.
Article in English | MEDLINE | ID: mdl-37681957

ABSTRACT

HIV-1 uses heterogeneous transcription start sites (TSSs) to generate two RNA 5' isoforms that adopt radically different structures and perform distinct replication functions. Although these RNAs differ in length by only two bases, exclusively, the shorter RNA is encapsidated while the longer RNA is excluded from virions and provides intracellular functions. The current study examined TSS usage and packaging selectivity for a broad range of retroviruses and found that heterogeneous TSS usage was a conserved feature of all tested HIV-1 strains, but all other retroviruses examined displayed unique TSSs. Phylogenetic comparisons and chimeric viruses' properties provided evidence that this mechanism of RNA fate determination was an innovation of the HIV-1 lineage, with determinants mapping to core promoter elements. Fine-tuning differences between HIV-1 and HIV-2, which uses a unique TSS, implicated purine residue positioning plus a specific TSS-adjacent dinucleotide in specifying multiplicity of TSS usage. Based on these findings, HIV-1 expression constructs were generated that differed from the parental strain by only two point mutations yet each expressed only one of HIV-1's two RNAs. Replication defects of the variant with only the presumptive founder TSS were less severe than those for the virus with only the secondary start site. IMPORTANCE: Retroviruses use RNA both to encode their proteins and to serve in place of DNA as their genomes. A recent surprising discovery was that the genomic RNAs and messenger RNAs of HIV-1 are not identical but instead differ subtly on one of their ends. These differences enable the functional separation of HIV-1 RNAs into genome and messenger roles. In this report, we examined a broad collection of HIV-1-related viruses and discovered that each produced only one end class of RNA, and thus must differ from HIV-1 in how they specify RNA fates. By comparing regulatory signals, we generated virus variants that pinpointed the determinants of HIV-1 RNA fates, as well as HIV-1 variants that produced only one or the other functional class of RNA. Competition and replication assays confirmed that HIV-1 has evolved to rely on the coordinated actions of both its RNA forms.


Subjects
HIV-1, Viral RNA, Transcription Initiation Site, HIV-1/genetics, Phylogeny, Retroviridae/genetics, Genetic Promoter Regions, Viral RNA/genetics
19.
J Biol Chem ; 299(9): 105130, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37543366

ABSTRACT

Long noncoding RNAs (lncRNAs) are increasingly being recognized as modulators in various biological processes. However, due to their low expression, their systematic characterization is difficult to determine. Here, we performed transcript annotation by a newly developed computational pipeline, termed RNA-seq and small RNA-seq combined strategy (RSCS), in a wide variety of cellular contexts. Thousands of high-confidence potential novel transcripts were identified by the RSCS, and the reliability of the transcriptome was verified by analysis of transcript structure, base composition, and sequence complexity. Evidenced by the length comparison, the frequency of the core promoter and the polyadenylation signal motifs, and the locations of transcription start and end sites, the transcripts appear to be full length. Furthermore, taking advantage of our strategy, we identified a large number of endogenous retrovirus-associated lncRNAs, and a novel endogenous retrovirus-lncRNA that was functionally involved in control of Yap1 expression and essential for early embryogenesis was identified. In summary, the RSCS can generate a more complete and precise transcriptome, and our findings greatly expanded the transcriptome annotation for the mammalian community.


Subjects
Molecular Sequence Annotation, Long Noncoding RNA, RNA-Seq, Animals, Embryonic Development/genetics, Mammals/embryology, Mammals/genetics, Molecular Sequence Annotation/methods, Genetic Promoter Regions/genetics, Reproducibility of Results, Retroviridae/genetics, Long Noncoding RNA/genetics, RNA-Seq/methods, Transcription Initiation Site, Transcriptome/genetics, YAP Signaling Proteins/genetics, YAP Signaling Proteins/metabolism
20.
BMC Microbiol ; 23(1): 243, 2023 Aug 31.
Article in English | MEDLINE | ID: mdl-37653502

ABSTRACT

Analysis of genome-wide transcription start sites (TSSs) revealed an unexpected complexity, since not only canonical TSSs of annotated genes are recognized by RNA polymerase. Non-canonical TSSs were detected antisense to, or within, annotated genes, as well as new intergenic (orphan) TSSs not associated with known genes. Previously, it was hypothesized that many such signals represent noise or pervasive transcription, not associated with a biological function. Here, a modified Cappable-seq protocol allows determining the primary transcriptome of the enterohemorrhagic E. coli O157:H7 EDL933 (EHEC). We used four different growth media, both in exponential and stationary growth phase, and replicated each thrice. This yielded 19,975 EHEC canonical and non-canonical TSSs, which occurred reproducibly in three biological replicates. This questions the hypothesis of experimental noise or pervasive transcription. Accordingly, conserved promoter motifs were found upstream, indicating proper TSSs. More than 50% of 5,567 canonical and between 32% and 47% of 10,355 non-canonical TSSs were differentially expressed in different media and growth phases, providing evidence for a potential biological function also of non-canonical TSSs. Thus, reproducible and environmentally regulated expression suggests that a substantial number of the non-canonical TSSs may be of unknown function rather than being the result of noise or pervasive transcription.


Subjects
Enterohemorrhagic Escherichia coli, Escherichia coli O157, Escherichia coli O157/genetics, Transcription Initiation Site, Cell Cycle, Culture Media