Results 1 - 20 of 2,517
1.
Cell ; 186(24): 5220-5236.e16, 2023 11 22.
Article in English | MEDLINE | ID: mdl-37944511

ABSTRACT

The Sc2.0 project is building a eukaryotic synthetic genome from scratch. A major milestone has been achieved with all individual Sc2.0 chromosomes assembled. Here, we describe the consolidation of multiple synthetic chromosomes using advanced endoreduplication intercrossing with tRNA expression cassettes to generate a strain with 6.5 synthetic chromosomes. The 3D chromosome organization and transcript isoform profiles were evaluated using Hi-C and long-read direct RNA sequencing. We developed CRISPR Directed Biallelic URA3-assisted Genome Scan, or "CRISPR D-BUGS," to map phenotypic variants caused by specific designer modifications, known as "bugs." We first fine-mapped a bug in synthetic chromosome II (synII) and then discovered a combinatorial interaction associated with synIII and synX, revealing an unexpected genetic interaction that links transcriptional regulation, inositol metabolism, and tRNASerCGA abundance. Finally, to expedite consolidation, we employed chromosome substitution to incorporate the largest chromosome (synIV), thereby consolidating >50% of the Sc2.0 genome in one strain.


Subject(s)
Chromosomes, Artificial, Yeast , Genome, Fungal , Saccharomyces cerevisiae , Base Sequence , Chromosomes/genetics , Saccharomyces cerevisiae/genetics , Synthetic Biology
2.
Cell ; 177(7): 1797-1813.e18, 2019 06 13.
Article in English | MEDLINE | ID: mdl-31104839

ABSTRACT

Accurate regulation of mRNA termination is required for correct gene expression. Here, we describe a role for SCAF4 and SCAF8 as anti-terminators, suppressing the use of early, alternative polyadenylation (polyA) sites. The SCAF4/8 proteins bind the hyper-phosphorylated RNAPII C-terminal repeat domain (CTD) phosphorylated on both Ser2 and Ser5 and are detected at early, alternative polyA sites. Concomitant knockout of human SCAF4 and SCAF8 results in altered polyA selection and subsequent early termination, leading to expression of truncated mRNAs and proteins lacking functional domains and is cell lethal. While SCAF4 and SCAF8 work redundantly to suppress early mRNA termination, they also have independent, non-essential functions. SCAF8 is an RNAPII elongation factor, whereas SCAF4 is required for correct termination at canonical, distal transcription termination sites in the presence of SCAF8. Together, SCAF4 and SCAF8 coordinate the transition between elongation and termination, ensuring correct polyA site selection and RNAPII transcriptional termination in human cells.


Subject(s)
RNA Polymerase II/metabolism , RNA, Messenger/biosynthesis , RNA-Binding Proteins/metabolism , Serine-Arginine Splicing Factors/metabolism , Transcription Elongation, Genetic , Transcription Termination, Genetic , HEK293 Cells , Humans , Poly A/genetics , Poly A/metabolism , Protein Domains , RNA Polymerase II/genetics , RNA, Messenger/genetics , RNA-Binding Proteins/genetics , Serine-Arginine Splicing Factors/genetics
3.
Immunity ; 56(8): 1939-1954.e12, 2023 08 08.
Article in English | MEDLINE | ID: mdl-37442134

ABSTRACT

Lung infection during severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) via the angiotensin-I-converting enzyme 2 (ACE2) receptor induces a cytokine storm. However, the precise mechanisms involved in severe COVID-19 pneumonia are unknown. Here, we showed that interleukin-10 (IL-10) induced the expression of ACE2 in normal alveolar macrophages, causing them to become vectors for SARS-CoV-2. The inhibition of this system in hamster models attenuated SARS-CoV-2 pathogenicity. Genome-wide association and quantitative trait locus analyses identified an IFNAR2-IL10RB readthrough transcript, COVID-19 infectivity-enhancing dual receptor (CiDRE), which was highly expressed in patients harboring COVID-19 risk variants at the IFNAR2 locus. We showed that CiDRE exerted synergistic effects via the IL-10-ACE2 axis in alveolar macrophages and functioned as a decoy receptor for type I interferons. Collectively, our data show that high IL-10 and CiDRE expression are potential risk factors for severe COVID-19. Thus, IL-10R and CiDRE inhibitors might be useful COVID-19 therapies.


Subject(s)
COVID-19 , Humans , COVID-19/genetics , SARS-CoV-2 , Angiotensin-Converting Enzyme 2/genetics , Interleukin-10/genetics , Macrophages, Alveolar/metabolism , Genome-Wide Association Study , Peptidyl-Dipeptidase A/metabolism
4.
Cell ; 168(5): 843-855.e13, 2017 02 23.
Article in English | MEDLINE | ID: mdl-28215706

ABSTRACT

The transcription-related DNA damage response was analyzed on a genome-wide scale with great spatial and temporal resolution. Upon UV irradiation, a slowdown of transcript elongation and restriction of gene activity to the promoter-proximal ∼25 kb is observed. This is associated with a shift from expression of long mRNAs to shorter isoforms, incorporating alternative last exons (ALEs) that are more proximal to the transcription start site. Notably, this includes a shift from a protein-coding ASCC3 mRNA to a shorter ALE isoform of which the RNA, rather than an encoded protein, is critical for the eventual recovery of transcription. The non-coding ASCC3 isoform counteracts the function of the protein-coding isoform, indicating crosstalk between them. Thus, the ASCC3 gene expresses both coding and non-coding transcript isoforms with opposite effects on transcription recovery after UV-induced DNA damage.


Subject(s)
Alternative Splicing/radiation effects , DNA Helicases/genetics , RNA, Untranslated/genetics , Transcription, Genetic , Ultraviolet Rays , Cell Line , Exons , Humans , RNA Polymerase II/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcription Elongation, Genetic/radiation effects , Transcription Initiation, Genetic/radiation effects
5.
Mol Cell ; 83(9): 1474-1488.e8, 2023 05 04.
Article in English | MEDLINE | ID: mdl-37116494

ABSTRACT

Transcriptional pauses mediate regulation of RNA biogenesis. DNA-encoded pause signals trigger pausing by stabilizing RNA polymerase (RNAP) swiveling and inhibiting DNA translocation. The N-terminal domain (NGN) of the only universal transcription factor, NusG/Spt5, modulates pausing through contacts to RNAP and DNA. Pro-pausing NusGs enhance pauses, whereas anti-pausing NusGs suppress pauses. Little is known about pausing and NusG in the human pathogen Mycobacterium tuberculosis (Mtb). We report that MtbNusG is pro-pausing. MtbNusG captures paused, swiveled RNAP by contacts to the RNAP protrusion and nontemplate-DNA wedged between the NGN and RNAP gate loop. In contrast, anti-pausing Escherichia coli (Eco) NGN contacts the MtbRNAP gate loop, inhibiting swiveling and pausing. Using CRISPR-mediated genetics, we show that pro-pausing NGN is required for mycobacterial fitness. Our results define an essential function of mycobacterial NusG and the structural basis of pro- versus anti-pausing NusG activity, with broad implications for the function of all NusG orthologs.


Subject(s)
Escherichia coli Proteins , Mycobacterium tuberculosis , Humans , Transcription Factors/genetics , Transcription Factors/chemistry , Transcription, Genetic , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/metabolism , Escherichia coli Proteins/genetics , DNA-Directed RNA Polymerases/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , DNA , Peptide Elongation Factors/metabolism
6.
Mol Cell ; 82(14): 2604-2617.e8, 2022 07 21.
Article in English | MEDLINE | ID: mdl-35654044

ABSTRACT

Stress-induced cleavage of transfer RNAs (tRNAs) into tRNA-derived fragments (tRFs) occurs across organisms from yeast to humans; yet, its mechanistic underpinnings and pathological consequences remain poorly defined. Small RNA profiling revealed increased abundance of a cysteine tRNA fragment (5'-tRFCys) during breast cancer metastatic progression. 5'-tRFCys was required for efficient breast cancer metastatic lung colonization and cancer cell survival. We identified Nucleolin as the direct binding partner of 5'-tRFCys. 5'-tRFCys promoted the oligomerization of Nucleolin and its bound metabolic transcripts Mthfd1l and Pafah1b1 into a higher-order transcript stabilizing ribonucleoprotein complex, which protected these transcripts from exonucleolytic degradation. Consistent with this, Mthfd1l and Pafah1b1 mediated pro-metastatic and metabolic effects downstream of 5'-tRFCys, impacting folate, one-carbon, and phosphatidylcholine metabolism. Our findings reveal that a tRF can promote oligomerization of an RNA-binding protein into a transcript stabilizing ribonucleoprotein complex, thereby driving specific metabolic pathways underlying cancer progression.


Subject(s)
Breast Neoplasms , RNA, Transfer , Breast Neoplasms/genetics , Female , Humans , Phosphoproteins , RNA, Messenger/genetics , RNA, Transfer/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Ribonucleoproteins/genetics , Nucleolin
7.
Genes Dev ; 35(11-12): 899-913, 2021 06.
Article in English | MEDLINE | ID: mdl-34016691

ABSTRACT

In mammals, a set of core clock genes form transcription-translation feedback loops to generate circadian oscillations. We and others recently identified a novel transcript at the Period2 (Per2) locus that is transcribed from the antisense strand of Per2. This transcript, Per2AS, is expressed rhythmically and antiphasic to Per2 mRNA, leading to our hypothesis that Per2AS and Per2 mutually inhibit each other's expression and form a double negative feedback loop. By perturbing the expression of Per2AS, we found that Per2AS transcription, but not transcript, represses Per2. However, Per2 does not repress Per2AS, as Per2 knockdown led to a decrease in the Per2AS level, indicating that Per2AS forms a single negative feedback loop with Per2 and maintains the level of Per2 within the oscillatory range. Per2AS also regulates the amplitude of the circadian clock, and this function cannot be solely explained through its interaction with Per2, as Per2 knockdown does not recapitulate the phenotypes of Per2AS perturbation. Overall, our data indicate that Per2AS is an important regulatory molecule in the mammalian circadian clock machinery. Our work also supports the idea that antisense transcripts of core clock genes constitute a common feature of circadian clocks, as they are found in other organisms.


Subject(s)
Circadian Clocks/genetics , RNA, Antisense/genetics , RNA, Antisense/metabolism , Animals , Feedback, Physiological , Gene Knockdown Techniques , Mice , Period Circadian Proteins/genetics
8.
Trends Genet ; 40(1): 83-93, 2024 01.
Article in English | MEDLINE | ID: mdl-37953195

ABSTRACT

Recent technological and algorithmic advances enable single-cell transcriptomic analysis with remarkable depth and breadth. Nonetheless, a persistent challenge is the compromise between the ability to profile high numbers of cells and the achievement of full-length transcript coverage. Currently, the field is progressing and developing new and creative solutions that improve cellular throughput, gene detection sensitivity and full-length transcript capture. Furthermore, long-read sequencing approaches for single-cell transcripts are breaking frontiers that have previously blocked full transcriptome characterization. We here present a comprehensive overview of available options for single-cell transcriptome profiling, highlighting the key advantages and disadvantages of each approach.


Subject(s)
High-Throughput Nucleotide Sequencing , Transcriptome , Transcriptome/genetics , Gene Expression Profiling , Sequence Analysis, RNA
9.
Am J Hum Genet ; 111(8): 1524-1543, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39053458

ABSTRACT

Gene misexpression is the aberrant transcription of a gene in a context where it is usually inactive. Despite its known pathological consequences in specific rare diseases, we have a limited understanding of its wider prevalence and mechanisms in humans. To address this, we analyzed gene misexpression in 4,568 whole-blood bulk RNA sequencing samples from INTERVAL study blood donors. We found that while individual misexpression events occur rarely, in aggregate they were found in almost all samples and a third of inactive protein-coding genes. Using 2,821 paired whole-genome and RNA sequencing samples, we identified that misexpression events are enriched in cis for rare structural variants. We established putative mechanisms through which a subset of SVs lead to gene misexpression, including transcriptional readthrough, transcript fusions, and gene inversion. Overall, we develop misexpression as a type of transcriptomic outlier analysis and extend our understanding of the variety of mechanisms by which genetic variants can influence gene expression.


Subject(s)
Gene Expression Regulation , Humans , Sequence Analysis, RNA , Genetic Variation , Genomic Structural Variation/genetics , Transcriptome/genetics , Blood Donors
10.
Mol Cell ; 76(1): 57-69.e9, 2019 10 03.
Article in English | MEDLINE | ID: mdl-31519522

ABSTRACT

Although correlations between RNA polymerase II (RNAPII) transcription stress, R-loops, and genome instability have been established, the mechanisms underlying these connections remain poorly understood. Here, we used a mutant version of the transcription elongation factor TFIIS (TFIISmut), aiming to specifically induce increased levels of RNAPII pausing, arrest, and/or backtracking in human cells. Indeed, TFIISmut expression results in slower elongation rates, relative depletion of polymerases from the end of genes, and increased levels of stopped RNAPII; it affects mRNA splicing and termination as well. Remarkably, TFIISmut expression also dramatically increases R-loops, which may form at the anterior end of backtracked RNAPII and trigger genome instability, including DNA strand breaks. These results shed light on the relationship between transcription stress and R-loops and suggest that different classes of R-loops may exist, potentially with distinct consequences for genome stability.


Subject(s)
Genomic Instability , R-Loop Structures , RNA, Messenger/genetics , Stress, Physiological , Transcription, Genetic , Transcriptional Elongation Factors/metabolism , Cell Line, Tumor , HEK293 Cells , Humans , Mutation , RNA Polymerase II/metabolism , RNA Splicing , RNA, Messenger/chemistry , RNA, Messenger/metabolism , Structure-Activity Relationship , Transcriptional Elongation Factors/chemistry , Transcriptional Elongation Factors/genetics
11.
Mol Cell ; 75(2): 340-356.e10, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31253575

ABSTRACT

The microRNAs encoded by the miR-17∼92 polycistron are commonly overexpressed in cancer and orchestrate a wide range of oncogenic functions. Here, we identify a mechanism for miR-17∼92 oncogenic function through the disruption of endogenous microRNA (miRNA) processing. We show that, upon oncogenic overexpression of the miR-17∼92 primary transcript (pri-miR-17∼92), the microprocessor complex remains associated with partially processed intermediates that aberrantly accumulate. These intermediates reflect a series of hierarchical and conserved steps in the early processing of the pri-miR-17∼92 transcript. Encumbrance of the microprocessor by miR-17∼92 intermediates leads to the broad but selective downregulation of co-expressed polycistronic miRNAs, including miRNAs derived from tumor-suppressive miR-34b/c and from the Dlk1-Dio3 polycistrons. We propose that the identified steps of polycistronic miR-17∼92 biogenesis contribute to the oncogenic re-wiring of gene regulation networks. Our results reveal previously unappreciated functional paradigms for polycistronic miRNAs in cancer.


Subject(s)
Carcinogenesis/genetics , MicroRNAs/genetics , RNA Processing, Post-Transcriptional/genetics , Calcium-Binding Proteins/genetics , Gene Expression Regulation, Neoplastic/genetics , Humans , Iodide Peroxidase/genetics , Membrane Proteins/genetics , MicroRNAs/biosynthesis , Nucleic Acid Conformation
12.
Trends Genet ; 39(1): 31-33, 2023 01.
Article in English | MEDLINE | ID: mdl-36207147

ABSTRACT

Disturbance in the regulation of transcript structure plays a crucial role in human disease. In a recent study, Glinos et al. characterized allele-specific transcript alterations in long-read RNA sequencing (RNA-seq) data derived from multiple human tissues and provide a high-resolution view of how disease-associated genetic variants affect transcript structure.


Subject(s)
RNA , Transcriptome , Humans , Transcriptome/genetics , Alleles , RNA/genetics , Sequence Analysis, RNA , Base Sequence , High-Throughput Nucleotide Sequencing , Gene Expression Profiling
13.
Trends Genet ; 39(4): 320-333, 2023 04.
Article in English | MEDLINE | ID: mdl-36681580

ABSTRACT

Studies using highly sensitive targeted RNA enrichment methods have shown that a large portion of the human transcriptome remains to be discovered and that most of the genome is transcribed in a complex, interleaved fashion characterized by a complex web of transcripts emanating from protein coding and noncoding loci. These results resonate with those from single-cell transcriptome profiling endeavors that reveal the existence of multiple novel, cell type-specific transcripts and clearly demonstrate that our understanding of the complexities of the human transcriptome is far from being complete. Here, we review the current status of the targeted RNA enrichment techniques, their application to the discovery of novel cell type-specific transcripts, and their impact on our understanding of the human genome and transcriptome.


Subject(s)
RNA, Long Noncoding , Transcriptome , Animals , Humans , Transcriptome/genetics , RNA/genetics , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Genome, Human , RNA, Long Noncoding/genetics , Mammals/genetics
14.
Development ; 150(7), 2023 04 01.
Article in English | MEDLINE | ID: mdl-36975404

ABSTRACT

Spermatogenic cells express more alternatively spliced RNAs than most whole tissues; however, the regulation of these events remains unclear. Here, we have characterized the function of a testis-specific IQ motif-containing H gene (Iqch) using a mutant mouse model. We found that Iqch is essential for the specific expression of RNA isoforms during spermatogenesis. Using immunohistochemistry of the testis, we noted that Iqch was expressed mainly in the nucleus of spermatocyte and spermatid, where IQCH appeared juxtaposed with SRRM2 and ERSP1 in the nuclear speckles, suggesting that interactions among these proteins regulate alternative splicing (AS). Using RNA-seq, we found that mutant Iqch produces alterations in gene expression, including the clear downregulation of testis-specific lncRNAs and protein-coding genes at the spermatid stage, and AS modifications - principally increased intron retention - resulting in complete male infertility. Interestingly, we identified previously unreported spliced transcripts in the wild-type testis, while mutant Iqch modified the expression and use of hundreds of RNA isoforms, favouring the expression of the canonical form. This suggests that Iqch is part of a splicing control mechanism, which is essential in germ cell biology.


Subject(s)
RNA Isoforms , Testis , Animals , Mice , Male , Testis/metabolism , RNA Isoforms/metabolism , Spermatogenesis/genetics , Spermatids/metabolism , Protein Isoforms/genetics , Protein Isoforms/metabolism
15.
Mol Cell ; 72(1): 10-17, 2018 10 04.
Article in English | MEDLINE | ID: mdl-30290147

ABSTRACT

Transcript buffering involves reciprocal adjustments between overall rates in mRNA synthesis and degradation to maintain similar cellular concentrations of mRNAs. This phenomenon was first discovered in yeast and encompasses coordination between the nuclear and cytoplasmic compartments. Transcript buffering was revealed by novel methods for pulse labeling of RNA to determine in vivo synthesis and degradation rates. In this Perspective, we discuss the current knowledge of transcript buffering. Emphasis is placed on the future challenges to determine the nature and directionality of the buffering signals, the generality of transcript buffering beyond yeast, and the molecular mechanisms responsible for this balancing.


Subject(s)
RNA Stability/genetics , RNA, Messenger/biosynthesis , Transcription, Genetic , Cell Nucleus/genetics , Cytoplasm/genetics , RNA Caps/genetics , RNA, Messenger/genetics , Saccharomyces cerevisiae/genetics
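
As a rough illustration of the buffering idea in entry 15, the sketch below assumes simple first-order kinetics, d[mRNA]/dt = synthesis - degradation x [mRNA]. This model is an assumption made here for illustration and is not stated in the abstract. Under it, the steady-state concentration is the synthesis rate divided by the degradation rate constant, so proportional changes in both rates leave the concentration unchanged. The function name and all numbers are hypothetical.

```python
def steady_state_mrna(synthesis_rate: float, degradation_rate: float) -> float:
    """Steady state of d[mRNA]/dt = synthesis_rate - degradation_rate * [mRNA].

    Assumes first-order decay (an illustrative assumption, not from the abstract).
    """
    if degradation_rate <= 0:
        raise ValueError("degradation_rate must be positive")
    return synthesis_rate / degradation_rate


# Hypothetical rates in arbitrary units.
baseline = steady_state_mrna(10.0, 0.5)
# Synthesis halves and degradation halves with it: the level is buffered.
buffered = steady_state_mrna(5.0, 0.25)
# Synthesis halves but degradation does not adjust: the level drops.
unbuffered = steady_state_mrna(5.0, 0.5)
```

In this toy model, `baseline` and `buffered` are equal, while `unbuffered` is half of `baseline`.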
16.
Proc Natl Acad Sci U S A ; 120(39): e2300348120, 2023 09 26.
Article in English | MEDLINE | ID: mdl-37733738

ABSTRACT

The intensity of muscle contraction, and therefore movement vigor, needs to be adaptable to enable complex motor behaviors. This can be achieved by adjusting the properties of motor neurons, which form the final common pathway for all motor output from the central nervous system. Here, we identify roles for a neuropeptide, cocaine- and amphetamine-regulated transcript (CART), in the control of movement vigor. We reveal distinct but parallel mechanisms by which CART and acetylcholine, both released at C bouton synapses on motor neurons, selectively amplify the output of subtypes of motor neurons that are recruited during intense movement. We find that mice with broad genetic deletion of CART or selective elimination of acetylcholine from C boutons exhibit deficits in behavioral tasks that require higher levels of motor output. Overall, these data uncover spinal modulatory mechanisms that control movement vigor to support movements that require a high degree of muscle force.


Subject(s)
Acetylcholine , Synapses , Animals , Mice , Presynaptic Terminals , Motor Neurons , Central Nervous System
17.
Plant J ; 117(5): 1614-1634, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38047591

ABSTRACT

Ribosome profiling (Ribo-seq) is a powerful method for the deep analysis of translation mechanisms and regulatory circuits during gene expression. Extraction and sequencing of ribosome-protected fragments (RPFs) and parallel RNA-seq yields genome-wide insight into translational dynamics and post-transcriptional control of gene expression. Here, we provide details on the Ribo-seq method and the subsequent analysis with the unicellular model alga Chlamydomonas reinhardtii (Chlamydomonas) for generating high-resolution data covering more than 10 000 different transcripts. Detailed analysis of the ribosomal offsets on transcripts uncovers presumable transition states during translocation of elongating ribosomes within the 5' and 3' sections of transcripts and characteristics of eukaryotic translation termination, which are fundamentally distinct for chloroplast translation. In chloroplasts, a heterogeneous RPF size distribution along the coding sequence indicates specific regulatory phases during protein synthesis. For example, local accumulation of small RPFs correlates with local slowdown of psbA translation, possibly uncovering an uncharacterized regulatory step during PsbA/D1 synthesis. Further analyses of RPF distribution along specific cytosolic transcripts revealed characteristic patterns of translation elongation exemplified for the major light-harvesting complex proteins, LHCs. By providing high-quality datasets for all subcellular genomes and attaching our data to the Chlamydomonas reference genome, we aim to make ribosome profiles easily accessible for the broad research community. The data can be browsed without advanced bioinformatic background knowledge for translation output levels of specific genes and their splice variants and for monitoring genome annotation.


Subject(s)
Chlamydomonas , Ribosome Profiling , Chlamydomonas/genetics , Chlamydomonas/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/genetics , Ribosomes/metabolism , Protein Biosynthesis , Gene Expression Profiling
18.
Plant J ; 119(1): 432-444, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38635415

ABSTRACT

Thiamine functions as a crucial activator modulating plant health and broad-spectrum stress tolerances. However, the role of thiamine in regulating plant virus infection is largely unknown. Here, we report that the multifunctional 17K protein encoded by barley yellow dwarf virus-GAV (BYDV-GAV) interacted with barley pyrimidine synthase (HvTHIC), a key enzyme in thiamine biosynthesis. HvTHIC was found to be localized in chloroplast via an N-terminal 74-amino acid domain. However, the 17K-HvTHIC interaction restricted HvTHIC targeting to chloroplasts and triggered autophagy-mediated HvTHIC degradation. Upon BYDV-GAV infection, the expression of the HvTHIC gene was significantly induced, and this was accompanied by accumulation of thiamine and salicylic acid. Silencing of HvTHIC expression promoted BYDV-GAV accumulation. Transcriptomic analysis of HvTHIC silenced and non-silenced barley plants showed that the differentially expressed genes were mainly involved in plant-pathogen interaction, plant hormone signal induction, phenylpropanoid biosynthesis, starch and sucrose metabolism, photosynthesis-antenna protein, and MAPK signaling pathway. Thiamine treatment enhanced barley resistance to BYDV-GAV. Taken together, our findings reveal a molecular mechanism underlying how BYDV impedes thiamine biosynthesis to uphold viral infection in plants.


Subject(s)
Hordeum , Plant Diseases , Plant Proteins , Thiamine , Hordeum/virology , Hordeum/genetics , Hordeum/metabolism , Thiamine/metabolism , Thiamine/biosynthesis , Plant Diseases/virology , Plant Diseases/genetics , Plant Proteins/genetics , Plant Proteins/metabolism , Luteovirus/physiology , Gene Expression Regulation, Plant , Viral Proteins/metabolism , Viral Proteins/genetics , Chloroplasts/metabolism , Salicylic Acid/metabolism , Host-Pathogen Interactions , Disease Resistance/genetics
19.
Plant J ; 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39145419

ABSTRACT

Accurate quantification of gene and transcript-specific expression, with the underlying knowledge of precise transcript isoforms, is crucial to understanding many biological processes. Analysis of RNA sequencing data has benefited from the development of alignment-free algorithms which enhance the precision and speed of expression analysis. However, such algorithms require a reference transcriptome. Here we generate a reference transcript dataset (LsRTDv1) for lettuce (cv. Saladin), combining long- and short-read sequencing with publicly available transcriptome annotations, and filtering to keep only transcripts with high-confidence splice junctions and transcriptional start and end sites. LsRTDv1 identifies novel genes (mostly long non-coding RNAs) and increases the number of transcript isoforms per gene in the lettuce genome from 1.4 to 2.7. We show that LsRTDv1 significantly increases the mapping rate of RNA-seq data from a lettuce time-series experiment (mock- and Botrytis cinerea-inoculated) and enables detection of genes that are differentially alternatively spliced in response to infection as well as transcript-specific expression changes. LsRTDv1 is a valuable resource for investigation of transcriptional and alternative splicing regulation in lettuce.

20.
Biostatistics ; 25(2): 559-576, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-37040757

ABSTRACT

Differential transcript usage (DTU) occurs when the relative expression of multiple transcripts arising from the same gene changes between different conditions. Existing approaches to detect DTU often rely on computational procedures that can have speed and scalability issues as the number of samples increases. Here we propose a new method, CompDTU, that uses compositional regression to model the relative abundance proportions of each transcript that are of interest in DTU analyses. This procedure leverages fast matrix-based computations that make it ideally suited for DTU analysis with larger sample sizes. This method also allows for the testing of and adjustment for multiple categorical or continuous covariates. Additionally, many existing approaches for DTU ignore quantification uncertainty in the expression estimates for each transcript in RNA-seq data. We extend our CompDTU method to incorporate quantification uncertainty leveraging common output from RNA-seq expression quantification tools in a novel method CompDTUme. Through several power analyses, we show that CompDTU has excellent sensitivity and reduces false positive results relative to existing methods. Additionally, CompDTUme results in further improvements in performance over CompDTU with sufficient sample size for genes with high levels of quantification uncertainty, while also maintaining favorable speed and scalability. We motivate our methods using data from the Cancer Genome Atlas Breast Invasive Carcinoma data set, specifically using RNA-seq data from primary tumors for 740 patients with breast cancer. We show greatly reduced computation time from our new methods as well as the ability to detect several novel genes with significant DTU across different breast cancer subtypes.


Subject(s)
Breast Neoplasms , Gene Expression Profiling , Humans , Female , Uncertainty , Sequence Analysis, RNA/methods , Genome , Breast Neoplasms/genetics
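
The definition of DTU in entry 20 can be made concrete with a minimal sketch: a gene's total expression stays the same between two conditions while the share contributed by each transcript changes. The counts, function names, and the additive log-ratio transform below are illustrative assumptions. The abstract does not say which compositional transform CompDTU applies, and this is not the authors' implementation.

```python
import math


def usage_proportions(counts):
    """Convert per-transcript counts for one gene into relative proportions."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("gene has no expression")
    return [c / total for c in counts]


def additive_log_ratio(proportions):
    """Log-ratio of each component against the last one.

    A simple stand-in for a compositional transform (assumption); it requires
    strictly positive proportions.
    """
    if any(p <= 0 for p in proportions):
        raise ValueError("proportions must be strictly positive")
    reference = proportions[-1]
    return [math.log(p / reference) for p in proportions[:-1]]


# Hypothetical counts for one gene with two transcripts.
condition_a = [80, 20]
condition_b = [20, 80]

props_a = usage_proportions(condition_a)
props_b = usage_proportions(condition_b)
alr_a = additive_log_ratio(props_a)
alr_b = additive_log_ratio(props_b)
```

Both conditions have the same gene-level total, so a gene-level expression test would see no change. The proportions, and therefore the log-ratio coordinates that a compositional regression would model, differ between conditions.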