ABSTRACT
Dysregulation of the DNA/RNA-binding protein FUS causes certain subtypes of ALS/FTD by largely unknown mechanisms. Recent evidence has shown that FUS toxic gain of function due either to mutations or to increased expression can disrupt critical cellular processes, including mitochondrial functions. Here, we demonstrate that in human cells overexpressing wild-type FUS or expressing mutant derivatives, the protein associates with multiple mRNAs, and these are enriched in mRNAs encoding mitochondrial respiratory chain components. Notably, this sequestration leads to reduced levels of the encoded proteins, which is sufficient to bring about disorganized mitochondrial networks, reduced aerobic respiration and increased reactive oxygen species. We further show that mutant FUS associates with mitochondria and with mRNAs encoded by the mitochondrial genome. Importantly, similar results were also observed in fibroblasts derived from ALS patients with FUS mutations. Finally, we demonstrate that FUS loss of function does not underlie the observed mitochondrial dysfunction, and also provide a mechanism for the preferential sequestration of the respiratory chain complex mRNAs by FUS that does not involve sequence-specific binding. Together, our data reveal that respiratory chain complex mRNA sequestration underlies the mitochondrial defects characteristic of ALS/FTD and contributes to the FUS toxic gain of function linked to this disease spectrum.
Subjects
Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/physiopathology , Gene Expression Regulation/genetics , Mitochondria/pathology , Messenger RNA/metabolism , RNA-Binding Protein FUS/genetics , RNA-Binding Protein FUS/metabolism , Cell Line , Cell Respiration/genetics , Cultured Cells , Electron Transport/genetics , Mitochondrial Genome , Humans , Mitochondria/genetics , Mutation , Pathological Protein Aggregation/genetics , Protein Binding/genetics
ABSTRACT
mRNA-based vaccines and therapeutics are gaining popularity and usage across a wide range of conditions. One of the critical issues when designing such mRNAs is sequence optimization. Even small proteins or peptides can be encoded by an enormously large number of mRNAs. The actual mRNA sequence can have a large impact on several properties, including expression, stability, immunogenicity, and more. To enable the selection of an optimal sequence, we developed CodonBERT, a large language model (LLM) for mRNAs. Unlike prior models, CodonBERT uses codons as inputs, which enables it to learn better representations. CodonBERT was trained using more than 10 million mRNA sequences from a diverse set of organisms. The resulting model captures important biological concepts. CodonBERT can also be extended to perform prediction tasks for various mRNA properties. CodonBERT outperforms previous mRNA prediction methods, including on a new flu vaccine data set.
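Two points in this abstract can be made concrete with a minimal Python sketch. It is not CodonBERT's code: the function names are illustrative assumptions, and the sketch only shows (a) splitting a coding sequence into codon tokens, the input unit the abstract describes, and (b) counting how many distinct coding sequences encode a given peptide under the standard genetic code.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (the 20 values sum to the 61 sense codons).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def codon_tokens(cds: str) -> list[str]:
    """Split a coding sequence into non-overlapping 3-nt codon tokens (RNA alphabet)."""
    cds = cds.upper().replace("T", "U")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def synonymous_count(peptide: str) -> int:
    """Count coding sequences that encode the peptide (stop codon not included)."""
    return prod(CODON_DEGENERACY[aa] for aa in peptide.upper())
```

For example, `synonymous_count("MKLV")` is 1 × 2 × 6 × 4 = 48, and the count grows multiplicatively with peptide length, which is why exhaustive enumeration quickly becomes impractical.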
Subjects
Messenger RNA , mRNA Vaccines , Humans , Messenger RNA/genetics , Codon , Algorithms
ABSTRACT
Alternative polyadenylation (APA) produces mRNA isoforms with different 3' UTR lengths. Previous studies indicated that 3' end processing and mRNA export are intertwined in gene regulation. Here, we show that mRNA export factors generally facilitate usage of distal cleavage and polyadenylation sites (PASs), leading to long 3' UTR isoform expression. By focusing on the export receptor NXF1, which exhibits the most potent effect on APA in this study, we reveal several gene features that impact NXF1-dependent APA, including 3' UTR size, gene size, and AT content. Surprisingly, NXF1 downregulation results in RNA polymerase II (Pol II) accumulation at the 3' end of genes, correlating with its role in APA regulation. Moreover, NXF1 cooperates with CFI-68 to facilitate nuclear export of long 3' UTR isoform with UGUA motifs. Together, our work reveals important roles of NXF1 in coordinating transcriptional dynamics, 3' end processing, and nuclear export of long 3' UTR transcripts, implicating NXF1 as a nexus of gene regulation.
Subjects
Cell Nucleus/metabolism , Nucleocytoplasmic Transport Proteins/metabolism , Polyadenylation , Messenger RNA/biosynthesis , RNA-Binding Proteins/metabolism , Genetic Transcription , 3' Untranslated Regions , Cell Nucleus Active Transport , Binding Sites , Cell Nucleus/genetics , HEK293 Cells , HeLa Cells , Humans , Kinetics , Nucleocytoplasmic Transport Proteins/genetics , Protein Binding , RNA Polymerase II/metabolism , Messenger RNA/genetics , RNA-Binding Proteins/genetics
ABSTRACT
Transcription of eukaryotic protein-coding genes generates immature mRNAs that are subjected to a series of processing events, including capping, splicing, cleavage and polyadenylation (CPA), and chemical modifications of bases. Alternative polyadenylation (APA) greatly contributes to mRNA diversity in the cell. By determining the length of the 3' untranslated region, APA generates transcripts with different regulatory elements, such as miRNA and RBP binding sites, which can influence mRNA stability, turnover, and translation. In the model plant Arabidopsis thaliana, APA is involved in the control of seed dormancy and flowering. In view of the physiological importance of APA in plants, we decided to investigate the effects of light/dark conditions and compare the underlying mechanisms to those elucidated for alternative splicing (AS). We found that light controls APA in approximately 30% of Arabidopsis genes. Similar to AS, the effect of light on APA requires functional chloroplasts, is not affected in mutants of the phytochrome and cryptochrome photoreceptor pathways, and is observed in roots only when the communication with the photosynthetic tissues is not interrupted. Furthermore, mitochondrial and TOR kinase activities are necessary for the effect of light. However, unlike AS, coupling with transcriptional elongation does not seem to be involved since light-dependent APA regulation is neither abolished in mutants of the TFIIS transcript elongation factor nor universally affected by chromatin relaxation caused by histone deacetylase inhibition. Instead, regulation seems to correlate with changes in the abundance of constitutive CPA factors, also mediated by the chloroplast.
Subjects
Arabidopsis , Chloroplasts , Plant Gene Expression Regulation , Light , Polyadenylation , Arabidopsis/genetics , Arabidopsis/metabolism , Chloroplasts/metabolism , Chloroplasts/genetics , Alternative Splicing , Arabidopsis Proteins/metabolism , Arabidopsis Proteins/genetics , Messenger RNA/genetics , Messenger RNA/metabolism
ABSTRACT
Posttranscriptional regulation has emerged as a driver for leukemia development and an avenue for therapeutic targeting. Among posttranscriptional processes, alternative polyadenylation (APA) is globally dysregulated across cancer types. However, limited studies have focused on the prevalence and role of APA in myeloid leukemia. Furthermore, it is poorly understood how altered poly(A) site usage of individual genes contributes to malignancy or whether targeting global APA patterns might alter oncogenic potential. In this study, we examined global APA dysregulation in patients with acute myeloid leukemia (AML) by performing 3' region extraction and deep sequencing (3'READS) on a subset of AML patient samples along with healthy hematopoietic stem and progenitor cells (HSPCs) and by analyzing publicly available data from a broad AML patient cohort. We show that patient cells exhibit global 3' untranslated region (UTR) shortening and coding sequence lengthening due to differences in poly(A) site (PAS) usage. Among APA regulators, expression of FIP1L1, one of the core cleavage and polyadenylation factors, correlated with the degree of APA dysregulation in our 3'READS data set. Targeting global APA by FIP1L1 knockdown reversed the global trends seen in patients. Importantly, FIP1L1 knockdown induced differentiation of t(8;21) cells by promoting 3'UTR lengthening and downregulation of the fusion oncoprotein AML1-ETO. In non-t(8;21) cells, FIP1L1 knockdown also promoted differentiation by attenuating mechanistic target of rapamycin complex 1 (mTORC1) signaling and reducing MYC protein levels. Our study provides mechanistic insights into the role of APA in AML pathogenesis and indicates that targeting global APA patterns can overcome the differentiation block in patients with AML.
Subjects
Leukemic Gene Expression Regulation , Acute Myeloid Leukemia/genetics , Polyadenylation , 3' Untranslated Regions , Cultured Cells , Hematopoietic Stem Cells/metabolism , Humans , Cultured Tumor Cells , mRNA Cleavage and Polyadenylation Factors/genetics
ABSTRACT
Prostate cancer (PC) relies on androgen receptor (AR) signaling. While hormonal therapy (HT) is efficacious, most patients evolve to an incurable castration-resistant stage (CRPC). To date, most proposed mechanisms of acquired resistance to HT have focused on AR transcriptional activity. Herein, we uncover a new role for the AR in alternative cleavage and polyadenylation (APA). Inhibition of the AR by Enzalutamide globally regulates APA in PC cells, with specific enrichment in genes related to transcription and DNA topology, suggesting their involvement in transcriptome reprogramming. AR inhibition selects promoter-distal polyadenylation sites (pAs) enriched in cis-elements recognized by the cleavage and polyadenylation specificity factor (CPSF) complex. Conversely, promoter-proximal intronic pAs relying on the cleavage stimulation factor (CSTF) complex are repressed. Mechanistically, Enzalutamide induces rearrangement of APA subcomplexes and impairs the interaction between CPSF and CSTF. AR inhibition also induces co-transcriptional CPSF recruitment to gene promoters, predisposing the selection of pAs depending on this complex. Importantly, the scaffold CPSF160 protein is up-regulated in CRPC cells and its depletion represses HT-induced APA patterns. These findings uncover an unexpected role for the AR in APA regulation and suggest that APA-mediated transcriptome reprogramming represents an adaptive response of PC cells to HT.
Subjects
Castration-Resistant Prostatic Neoplasms , Androgen Receptors , Benzamides , Tumor Cell Line , Cell Proliferation , Cleavage and Polyadenylation Specificity Factor/genetics , Cleavage and Polyadenylation Specificity Factor/metabolism , Cleavage Stimulation Factor/metabolism , Humans , Male , Nitriles , Phenylthiohydantoin , Polyadenylation , Castration-Resistant Prostatic Neoplasms/genetics , Castration-Resistant Prostatic Neoplasms/metabolism , Androgen Receptors/genetics , Androgen Receptors/metabolism
ABSTRACT
The RNA-binding protein ALYREF plays key roles in nuclear export and also 3'-end processing of polyadenylated mRNAs, but whether such regulation also extends to non-polyadenylated RNAs is unknown. Replication-dependent (RD)-histone mRNAs are not polyadenylated, but instead end in a stem-loop (SL) structure. Here, we demonstrate that ALYREF prevalently binds a region next to the SL on RD-histone mRNAs. SL-binding protein (SLBP) directly interacts with ALYREF and promotes its recruitment. ALYREF promotes histone pre-mRNA 3'-end processing by facilitating U7-snRNP recruitment through physical interaction with the U7-snRNP-specific component Lsm11. Furthermore, ALYREF, together with other components of the TREX complex, enhances histone mRNA export. Moreover, we show that 3'-end processing promotes ALYREF recruitment and histone mRNA export. Together, our results point to an important role of ALYREF in coordinating 3'-end processing and nuclear export of non-polyadenylated mRNAs.
Subjects
Histones/metabolism , Nuclear Proteins/metabolism , Post-Transcriptional RNA Processing , RNA Transport , Messenger RNA/metabolism , RNA-Binding Proteins/metabolism , U7 Small Nuclear Ribonucleoprotein/metabolism , Transcription Factors/metabolism , Cell Nucleus Active Transport , Exodeoxyribonucleases/genetics , Exodeoxyribonucleases/metabolism , Histones/genetics , Humans , Nuclear Proteins/genetics , Nucleocytoplasmic Transport Proteins/genetics , Nucleocytoplasmic Transport Proteins/metabolism , Phosphoproteins/genetics , Phosphoproteins/metabolism , Messenger RNA/genetics , RNA-Binding Proteins/genetics , U7 Small Nuclear Ribonucleoprotein/genetics , Transcription Factors/genetics
ABSTRACT
Staufen1 (STAU1) is an RNA-binding protein (RBP) that interacts with double-stranded RNA structures and has been implicated in regulating different aspects of mRNA metabolism. Previous studies have indicated that STAU1 interacts extensively with RNA structures in coding regions (CDSs) and 3'-untranslated regions (3'UTRs). In particular, duplex structures formed within 3'UTRs by inverted-repeat Alu elements (IRAlus) interact with STAU1 through its double-stranded RNA-binding domains (dsRBDs). Using 3' region extraction and deep sequencing coupled to ribonucleoprotein immunoprecipitation (3'READS + RIP), together with reanalyzing previous STAU1 binding and RNA structure data, we delineate STAU1 interactions transcriptome-wide, including binding differences between alternative polyadenylation (APA) isoforms. Consistent with previous reports, RNA structures are dominant features for STAU1 binding to CDSs and 3'UTRs. Overall, relative to short 3'UTR counterparts, longer 3'UTR isoforms of genes have stronger STAU1 binding, most likely due to a higher frequency of RNA structures, including specific IRAlus sequences. Nevertheless, a sizable fraction of genes express transcripts showing the opposite trend, attributable to AU-rich sequences in their alternative 3'UTRs that may recruit antagonistic RBPs and/or destabilize RNA structures. Using STAU1-knockout cells, we show that strong STAU1 binding to mRNA 3'UTRs generally enhances polysome association. However, IRAlus generally have little impact on STAU1-mediated polysome association despite having strong interactions with the protein. Taken together, our work reveals complex interactions of STAU1 with its cognate RNA substrates. Our data also shed light on distinct post-transcriptional fates for the widespread APA isoforms in mammalian cells.
Subjects
Cytoskeletal Proteins/chemistry , Cytoskeletal Proteins/metabolism , Polyribosomes/metabolism , Messenger RNA/chemistry , Messenger RNA/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , 3' Untranslated Regions , Alternative Splicing , Alu Elements , Cytoskeletal Proteins/genetics , Gene Expression Profiling , Gene Knockout Techniques , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , Immunoprecipitation , Molecular Conformation , RNA-Binding Motifs , RNA-Binding Proteins/genetics
ABSTRACT
Cleavage and polyadenylation is essential for 3' end processing of almost all eukaryotic mRNAs. Recent studies have shown widespread alternative cleavage and polyadenylation (APA) events leading to mRNA isoforms with different 3' UTRs and/or coding sequences. Here, we present a compendium of conserved cleavage and polyadenylation sites (PASs) in mammalian genes, based on approximately 1.2 billion 3' end sequencing reads from more than 360 human, mouse, and rat samples. We show that ~80% of mammalian mRNA genes contain at least one conserved PAS, and ~50% have conserved APA events. PAS conservation generally reduces promiscuous 3' end processing, stabilizing gene expression levels across species. Conservation of APA correlates with gene age, gene expression features, and gene functions. Genes with certain functions, such as cell morphology, cell proliferation, and mRNA metabolism, are particularly enriched with conserved APA events. Whereas tissue-specific genes typically have a low APA rate, brain-specific genes tend to evolve APA. In addition, we show enrichment of mRNA destabilizing motifs in alternative 3' UTR sequences, leading to substantial differences in mRNA stability between 3' UTR isoforms. Using conserved PASs, we reveal sequence motifs surrounding APA sites and a preference of adenosine at the cleavage site. Furthermore, we show that mutations of U-rich motifs around the PAS often accompany APA profile differences between species. Analysis of lncRNA PASs indicates a mechanism of PAS fixation through evolution of A-rich motifs. Taken together, our results present a comprehensive view of PAS evolution in mammals, and a phylogenic perspective on APA functions.
Subjects
Messenger RNA/chemistry , Messenger RNA/genetics , RNA Sequence Analysis/methods , 3' Untranslated Regions , Animals , Conserved Sequence , Molecular Evolution , Gene Expression Regulation , Humans , Mice , Mutation , Organ Specificity , Phylogeny , Polyadenylation , RNA Stability , Long Noncoding RNA/chemistry , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , Messenger RNA/metabolism , Rats , Species Specificity
ABSTRACT
PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which had a limited amount and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. First, PASs are mapped by the 3' region extraction and deep sequencing (3'READS) method, ensuring unequivocal PAS identification. Second, a large volume of data based on diverse biological samples increases PAS coverage by 3.5-fold over the EST-based version and provides PAS usage information. Third, strand-specific RNA-seq data are used to extend annotated 3' ends of genes to obtain more thorough annotations of alternative polyadenylation (APA) sites. Fourth, conservation information of PAS across mammals sheds light on significance of APA sites. The database (URL: http://www.polya-db.org/v3) currently holds PASs in human, mouse, rat and chicken, and has links to the UCSC genome browser for further visualization and for integration with other genomic data.
Subjects
Genetic Databases , High-Throughput Nucleotide Sequencing , Polyadenylation , RNA Sequence Analysis , Animals , Chickens/genetics , Genome , Humans , Mice , RNA Cleavage , Rats , User-Computer Interface
ABSTRACT
Oculopharyngeal muscular dystrophy (OPMD) is a late onset disease caused by polyalanine expansion in the poly(A) binding protein nuclear 1 (PABPN1). Several mouse models have been generated to study OPMD; however, most of these models have employed transgenic overexpression of alanine-expanded PABPN1. These models do not recapitulate the OPMD patient genotype and PABPN1 overexpression could confound molecular phenotypes. We have developed a knock-in mouse model of OPMD (Pabpn1+/A17) that contains one alanine-expanded Pabpn1 allele under the control of the native promoter and one wild-type Pabpn1 allele. This mouse is the closest available genocopy of OPMD patients. We show that Pabpn1+/A17 mice have a mild myopathic phenotype in adult and aged animals. We examined early molecular and biochemical phenotypes associated with expressing native levels of A17-PABPN1 and detected shorter poly(A) tails, modest changes in poly(A) signal (PAS) usage, and evidence of mitochondrial damage in these mice. Recent studies have suggested that a loss of PABPN1 function could contribute to muscle pathology in OPMD. To investigate a loss of function model of pathology, we generated a heterozygous Pabpn1 knock-out mouse model (Pabpn1+/Δ). Like the Pabpn1+/A17 mice, Pabpn1+/Δ mice have mild histologic defects, shorter poly(A) tails, and evidence of mitochondrial damage. However, the phenotypes detected in Pabpn1+/Δ mice only partially overlap with those detected in Pabpn1+/A17 mice. These results suggest that loss of PABPN1 function could contribute to but may not completely explain the pathology detected in Pabpn1+/A17 mice.
Subjects
Oculopharyngeal Muscular Dystrophy/genetics , Oculopharyngeal Muscular Dystrophy/metabolism , Poly(A)-Binding Protein I/genetics , Poly(A)-Binding Protein I/metabolism , Animals , Animal Disease Models , Gene Knock-In Techniques , Genotype , Mice , Knockout Mice , Mitochondria/metabolism , Skeletal Muscle/metabolism , Oculopharyngeal Muscular Dystrophy/pathology , Peptides , Phenotype
ABSTRACT
Alternative polyadenylation (APA) is a mechanism that generates multiple mRNA isoforms with different 3'UTRs and/or coding sequences from a single gene. Here, using 3' region extraction and deep sequencing (3'READS), we have systematically mapped cleavage and polyadenylation sites (PASs) in Drosophila melanogaster, expanding the total repertoire of PASs previously identified for the species, especially those located in A-rich genomic sequences. Cis-element analysis revealed distinct sequence motifs around fly PASs when compared to mammalian ones, including the greater enrichment of upstream UAUA elements and the less prominent presence of downstream UGUG elements. We found that over 75% of mRNA genes in Drosophila melanogaster undergo APA. The head tissue tends to use distal PASs when compared to the body, leading to preferential expression of APA isoforms with long 3'UTRs as well as with distal terminal exons. The distance between the APA sites and intron location of PAS are important parameters for APA difference between body and head, suggesting distinct PAS selection contexts. APA analysis of the RpII215C4 mutant strain, which harbors a mutant RNA polymerase II (RNAPII) with a slower elongation rate, revealed that a 50% decrease in transcriptional elongation rate leads to a mild trend of more usage of proximal, weaker PASs, both in 3'UTRs and in introns, consistent with the "first come, first served" model of APA regulation. However, this trend was not observed in the head, suggesting a different regulatory context in neuronal cells. Together, our data expand the PAS collection for Drosophila melanogaster and reveal a tissue-specific effect of APA regulation by RNAPII elongation rate.
Subjects
Alternative Splicing , Genetically Modified Animals/genetics , Drosophila melanogaster/genetics , Fungal Gene Expression Regulation , Polyadenylation , RNA Polymerase II/metabolism , Genetic Transcription Elongation , 3' Untranslated Regions/genetics , Animals , Genetically Modified Animals/growth & development , Genetically Modified Animals/metabolism , Drosophila melanogaster/growth & development , Drosophila melanogaster/metabolism , High-Throughput Nucleotide Sequencing , Male , RNA Polymerase II/genetics
ABSTRACT
Global investigation of poly(A) tails has been hindered by technical challenges. In a recent advance, two groups developed deep sequencing methods to globally interrogate poly(A) tail length and sequence with high precision, opening new avenues for investigation of poly(A) tail functions in mRNA metabolism. Initial applications of these methods reveal insights into the relationship between poly(A) tail length and translational efficiency, and identify widespread uridylation and guanylation at the 3' ends of transcripts.
Subjects
3' Untranslated Regions/genetics , High-Throughput Nucleotide Sequencing , Poly A/genetics , Post-Transcriptional RNA Processing , Messenger RNA/genetics , Animals , Humans
ABSTRACT
Sequencing of the 3' end of poly(A)(+) RNA identifies cleavage and polyadenylation sites (pAs) and measures transcript expression. We previously developed a method, 3' region extraction and deep sequencing (3'READS), to address mispriming issues that often plague 3' end sequencing. Here we report a new version, named 3'READS+, which has vastly improved accuracy and sensitivity. Using a special locked nucleic acid oligo to capture poly(A)(+) RNA and to remove the bulk of the poly(A) tail, 3'READS+ generates RNA fragments with an optimal number of terminal A's that balance data quality and detection of genuine pAs. With improved RNA ligation steps for efficiency, the method shows much higher sensitivity (over two orders of magnitude) compared to the previous version. Using 3'READS+, we have uncovered a sizable fraction of previously overlooked pAs located next to or within a stretch of adenylate residues in human genes and more accurately assessed the frequency of alternative cleavage and polyadenylation (APA) in HeLa cells (~50%). 3'READS+ will be a useful tool to accurately study APA and to analyze gene expression by 3' end counting, especially when the amount of input total RNA is limited.
Subjects
3' Untranslated Regions , High-Throughput Nucleotide Sequencing/methods , Messenger RNA/genetics , RNA Sequence Analysis/methods , HeLa Cells , Humans , Sensitivity and Specificity
ABSTRACT
Alternative cleavage and polyadenylation (APA) results in mRNA isoforms containing different 3' untranslated regions (3'UTRs) and/or coding sequences. How core cleavage/polyadenylation (C/P) factors regulate APA is not well understood. Using siRNA knockdown coupled with deep sequencing, we found that several C/P factors can play significant roles in 3'UTR-APA. Whereas Pcf11 and Fip1 enhance usage of proximal poly(A) sites (pAs), CFI-25/68, PABPN1 and PABPC1 promote usage of distal pAs. Strong cis element biases were found for pAs regulated by CFI-25/68 or Fip1, and the distance between pAs plays an important role in APA regulation. In addition, intronic pAs are substantially regulated by splicing factors, with U1 mostly inhibiting C/P events in introns near the 5' end of gene and U2 suppressing those in introns with features for efficient splicing. Furthermore, PABPN1 inhibits expression of transcripts with pAs near the transcription start site (TSS), a property possibly related to its role in RNA degradation. Finally, we found that groups of APA events regulated by C/P factors are also modulated in cell differentiation and development with distinct trends. Together, our results support an APA code where an APA event in a given cellular context is regulated by a number of parameters, including relative location to the TSS, splicing context, distance between competing pAs, surrounding cis elements and concentrations of core C/P factors.
Subjects
Cell Differentiation/genetics , Poly(A)-Binding Protein I/genetics , Polyadenylation/genetics , RNA Splicing/genetics , 3' Untranslated Regions/genetics , Exons , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Humans , Introns/genetics , Poly(A)-Binding Protein I/biosynthesis , RNA Stability/genetics , Messenger RNA/genetics
ABSTRACT
BACKGROUND: Most mammalian genes display alternative cleavage and polyadenylation (APA). Previous studies have indicated preferential expression of APA isoforms with short 3' untranslated regions (3'UTRs) in testes. RESULTS: By deep sequencing of the 3' end region of poly(A) + transcripts, we report widespread shortening of 3'UTR through APA during the first wave of spermatogenesis in mouse, with 3'UTR size being the shortest in spermatids. Using genes without APA as a control, we show that shortening of 3'UTR eliminates destabilizing elements, such as U-rich elements and transposable elements, which appear highly potent during spermatogenesis. We additionally found widespread regulation of APA events in introns and exons that can affect the coding sequence of transcripts and global activation of antisense transcripts upstream of the transcription start site, suggesting modulation of splicing and initiation of transcription during spermatogenesis. Importantly, genes that display significant 3'UTR shortening tend to have functions critical for further sperm maturation, and testis-specific genes display greater 3'UTR shortening than ubiquitously expressed ones, indicating functional relevance of APA to spermatogenesis. Interestingly, genes with shortened 3'UTRs tend to have higher RNA polymerase II and H3K4me3 levels in spermatids as compared to spermatocytes, features previously known to be associated with open chromatin state. CONCLUSIONS: Our data suggest that open chromatin may create a favorable cis environment for 3' end processing, leading to global shortening of 3'UTR during spermatogenesis. mRNAs with shortened 3'UTRs are relatively stable thanks to evasion of powerful mRNA degradation mechanisms acting on 3'UTR elements. Stable mRNAs generated in spermatids may be important for protein production at later stages of sperm maturation, when transcription is globally halted.
Subjects
3' Untranslated Regions , Chromatin/genetics , Gene Expression Regulation , Polyadenylation , Messenger RNA/genetics , Spermatogenesis , Animals , Chromatin/chemistry , DNA Transposable Elements , Male , Mice , RNA Stability , Messenger RNA/chemistry , Genetic Transcription , Transcriptome
ABSTRACT
Alternative cleavage and polyadenylation (APA) generates diverse mRNA isoforms. We developed 3' region extraction and deep sequencing (3'READS) to address mispriming issues that commonly plague poly(A) site (pA) identification, and we used the method to comprehensively map pAs in the mouse genome. Thorough annotation of gene 3' ends revealed over 5,000 previously overlooked pAs (~8% of total) flanked by A-rich sequences, underscoring the necessity of using an accurate tool for pA mapping. About 79% of mRNA genes and 66% of long noncoding RNA genes undergo APA, but these two gene types have distinct usage patterns for pAs in introns and upstream exons. Quantitative analysis of APA isoforms by 3'READS indicated that promoter-distal pAs, regardless of intron or exon locations, become more abundant during embryonic development and cell differentiation and that upregulated isoforms have stronger pAs, suggesting global modulation of the 3' end-processing activity in development and differentiation.
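The two quantities reported here, the share of genes undergoing APA and the relative abundance of promoter-distal pAs, can be sketched from per-site read counts. The Python below is only an illustration under stated assumptions: the input layout (read counts per gene, ordered from promoter-proximal to promoter-distal pA), the 5% usage cutoff, and all function names are hypothetical and are not taken from the 3'READS analysis itself.

```python
def used_sites(counts, min_fraction=0.05):
    """Indices of pAs whose share of the gene's reads reaches min_fraction.

    The 5% default cutoff is an assumption made for this sketch.
    """
    total = sum(counts)
    if total == 0:
        return []
    return [i for i, c in enumerate(counts) if c / total >= min_fraction]


def apa_fraction(pa_counts, min_fraction=0.05):
    """Fraction of expressed genes that use at least two pAs."""
    expressed = [c for c in pa_counts.values() if sum(c) > 0]
    if not expressed:
        return 0.0
    with_apa = sum(1 for c in expressed if len(used_sites(c, min_fraction)) >= 2)
    return with_apa / len(expressed)


def distal_usage(counts):
    """Share of a gene's reads that map to its most promoter-distal pA."""
    total = sum(counts)
    return counts[-1] / total if total else float("nan")
```

Comparing `distal_usage` for the same gene between two samples (for example, an early and a late developmental stage) gives a simple per-gene measure of a shift toward distal pAs.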
Subjects
3' Untranslated Regions/genetics , High-Throughput Nucleotide Sequencing , Polyadenylation , Animals , Mice , Long Noncoding RNA , Messenger RNA/genetics
ABSTRACT
Almost all eukaryotic pre-mRNAs are processed at the 3' end by the cleavage and polyadenylation (C/P) reaction, which preludes termination of transcription and gives rise to the poly(A) tail of mature mRNA. Genomic studies in recent years have indicated that most eukaryotic mRNA genes have multiple cleavage and polyadenylation sites (pAs), leading to alternative cleavage and polyadenylation (APA) products. APA isoforms generally differ in their 3' untranslated regions (3' UTRs), but can also have different coding sequences (CDSs). APA expands the repertoire of transcripts expressed from the genome, and is highly regulated under various physiological and pathological conditions. Growing lines of evidence have shown that RNA-binding proteins (RBPs) play important roles in regulation of APA. Some RBPs are part of the machinery for C/P; others influence pA choice through binding to adjacent regions. In this chapter, we review cis elements and trans factors involved in C/P, the significance of APA, and increasingly elucidated roles of RBPs in APA regulation. We also discuss analysis of APA using transcriptome-wide techniques as well as molecular biology approaches.
Subjects
3' Untranslated Regions/physiology , Gene Expression Regulation/physiology , Human Genome/physiology , Polyadenylation/physiology , RNA-Binding Proteins , Animals , Humans , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism
ABSTRACT
The degree to which translational control is specified by mRNA sequence is poorly understood in mammalian cells. Here, we constructed and leveraged a compendium of 3,819 ribosomal profiling datasets, distilling them into a transcriptome-wide atlas of translation efficiency (TE) measurements encompassing >140 human and mouse cell types. We subsequently developed RiboNN, a multitask deep convolutional neural network, and classic machine learning models to predict TEs in hundreds of cell types from sequence-encoded mRNA features, achieving state-of-the-art performance (r=0.79 in human and r=0.78 in mouse for mean TE across cell types). While the majority of earlier models solely considered 5' UTR sequence, RiboNN integrates contributions from the full-length mRNA sequence, learning that the 5' UTR, CDS, and 3' UTR respectively possess ~67%, 31%, and 2% per-nucleotide information density in the specification of mammalian TEs. Interpretation of RiboNN revealed that the spatial positioning of low-level di- and tri-nucleotide features (including codons) largely explains model performance, capturing mechanistic principles such as how ribosomal processivity and tRNA abundance control translational output. RiboNN is predictive of the translational behavior of base-modified therapeutic RNA, and can explain evolutionary selection pressures in human 5' UTRs. Finally, it detects a common language governing mRNA regulatory control and highlights the interconnectedness of mRNA translation, stability, and localization in mammalian organisms.
ABSTRACT
Characterization of shared patterns of RNA expression between genes across conditions has led to the discovery of regulatory networks and novel biological functions. However, it is unclear if such coordination extends to translation, a critical step in gene expression. Here, we uniformly analyzed 3,819 ribosome profiling datasets from 117 human and 94 mouse tissues and cell lines. We introduce the concept of Translation Efficiency Covariation (TEC), identifying coordinated translation patterns across cell types. We nominate potential mechanisms driving shared patterns of translation regulation. TEC is conserved across human and mouse cells and helps uncover gene functions. Moreover, our observations indicate that proteins that physically interact are highly enriched for positive covariation at both translational and transcriptional levels. Our findings establish translational covariation as a conserved organizing principle of mammalian transcriptomes.
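The idea of translation efficiency covariation can be illustrated with a small Python sketch. The abstract does not state which similarity measure defines TEC, so plain Pearson correlation is used here purely as a stand-in; the input layout (one list of TE values per gene, with cell types in the same order for every gene) and the function names are likewise assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def te_covariation(te_by_gene):
    """Pairwise correlation of TE profiles across cell types.

    te_by_gene maps gene -> list of TE values, one per cell type
    (same cell-type order for every gene; an assumption of this sketch).
    """
    genes = sorted(te_by_gene)
    return {
        (g, h): pearson(te_by_gene[g], te_by_gene[h])
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }
```

Gene pairs with strongly positive values would be candidates for coordinated translation; whether that holds for a real dataset depends on the measure and normalization actually used in the study.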