ABSTRACT
Translation regulation is critical for early mammalian embryonic development [1]. However, previous studies had been restricted to bulk measurements [2], precluding precise determination of translation regulation including allele-specific analyses. Here, to address this challenge, we developed a novel microfluidic isotachophoresis (ITP) approach, named RIBOsome profiling via ITP (Ribo-ITP), and characterized translation in single oocytes and embryos during early mouse development. We identified differential translation efficiency as a key mechanism regulating genes involved in centrosome organization and N6-methyladenosine modification of RNAs. Our high-coverage measurements enabled, to our knowledge, the first analysis of allele-specific ribosome engagement in early development. These led to the discovery of stage-specific differential engagement of zygotic RNAs with ribosomes and reduced translation efficiency of transcripts exhibiting allele-biased expression. By integrating our measurements with proteomics data, we discovered that ribosome occupancy in germinal vesicle-stage oocytes is the predominant determinant of protein abundance in the zygote. The Ribo-ITP approach will enable numerous applications by providing high-coverage and high-resolution ribosome occupancy measurements from ultra-low input samples including single cells.
Subject(s)
Embryonic Development , Isotachophoresis , Microfluidic Analytical Techniques , Protein Biosynthesis , Ribosome Profiling , Ribosomes , Single-Cell Analysis , Animals , Mice , Proteomics , Ribosomes/metabolism , Messenger RNA/genetics , Single-Cell Analysis/methods , Alleles , Microfluidic Analytical Techniques/methods , Oocytes/growth & development , Oocytes/metabolism , Isotachophoresis/methods , Ribosome Profiling/methods , Centrosome , Zygote/growth & development , Zygote/metabolism
ABSTRACT
Long-read sequencing technology enables highly accurate detection of allele-specific RNA expression, providing insights into the effects of genetic variation on splicing and RNA abundance. Furthermore, the ability to directly sequence RNA using the Oxford Nanopore technology promises the detection of RNA modifications in tandem with ascertaining the allelic origin of each molecule. Here, we leverage these advantages to determine allele-biased patterns of N6-methyladenosine (m6A) modifications in native mRNA. We utilized human and mouse cells with known genetic variants to assign allelic origin of each mRNA molecule combined with a supervised machine learning model to detect read-level m6A modification ratios. Our analyses revealed the importance of sequences adjacent to the DRACH-motif in determining m6A deposition, in addition to allelic differences that directly alter the motif. Moreover, we discovered allele-specific m6A modification (ASM) events with no genetic variants in close proximity to the differentially modified nucleotide, demonstrating the unique advantage of using long reads and surpassing the capabilities of antibody-based short-read approaches. This technological advancement promises to advance our understanding of the role of genetics in determining mRNA modifications.
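The role of the DRACH motif described above can be illustrated with a minimal sketch. It assumes the standard IUPAC reading of DRACH (D = A/G/U, R = A/G, H = A/C/U, with the candidate m6A at the central A) and works on plain sequence strings. The function names `find_drach_sites` and `motif_status` are hypothetical, and this is not the supervised model or the pipeline used in the study; it only shows how an allelic difference can create or destroy a motif.

```python
import re

# DRACH in IUPAC terms: D = A/G/U, R = A/G, A, C, H = A/C/U.
# Written over the DNA alphabet (T in place of U); the lookahead
# allows overlapping motif occurrences to be reported.
DRACH = re.compile(r"(?=([AGT][AG]AC[ACT]))")


def find_drach_sites(seq):
    """Return (0-based position of the central A, motif) for each DRACH match."""
    seq = seq.upper().replace("U", "T")
    return [(m.start() + 2, m.group(1)) for m in DRACH.finditer(seq)]


def motif_status(ref_seq, alt_seq):
    """Compare two equal-length allele sequences (hypothetical helper).

    Reports candidate sites present on only one allele, i.e. cases where
    a variant directly alters the motif.
    """
    ref = {pos for pos, _ in find_drach_sites(ref_seq)}
    alt = {pos for pos, _ in find_drach_sites(alt_seq)}
    return {"lost": sorted(ref - alt), "gained": sorted(alt - ref)}


if __name__ == "__main__":
    reference = "TTGGACTTT"
    alternate = "TTGGTCTTT"  # A>T substitution at the central A
    print(find_drach_sites(reference))
    print(motif_status(reference, alternate))
```

A comparison like this only captures variants that change the motif itself. The abstract's other findings (effects of adjacent sequence, and allele-specific modification with no nearby variant) require read-level modification calls and cannot be derived from sequence alone.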
ABSTRACT
In addition to sculpting eukaryotic transcripts by removing introns, pre-mRNA splicing greatly impacts protein composition of the emerging mRNP. The exon junction complex (EJC), deposited upstream of exon-exon junctions after splicing, is a major constituent of spliced mRNPs. Here, we report comprehensive analysis of the endogenous human EJC protein and RNA interactomes. We confirm that the major "canonical" EJC occupancy site in vivo lies 24 nucleotides upstream of exon junctions and that the majority of exon junctions carry an EJC. Unexpectedly, we find that endogenous EJCs multimerize with one another and with numerous SR proteins to form megadalton sized complexes in which SR proteins are super-stoichiometric to EJC core factors. This tight physical association may explain known functional parallels between EJCs and SR proteins. Further, their protection of long mRNA stretches from nuclease digestion suggests that endogenous EJCs and SR proteins cooperate to promote mRNA packaging and compaction.
Subject(s)
Exons , Proteome/analysis , Post-Transcriptional RNA Processing , Ribonucleoproteins/chemistry , Ribonucleoproteins/metabolism , Humans , Multiprotein Complexes/analysis , RNA Precursors/metabolism , RNA Splicing
ABSTRACT
MOTIVATION: Ribosome profiling is a widely-used technique for measuring ribosome occupancy at nucleotide resolution. However, the need to analyze this data at nucleotide resolution introduces unique challenges in data visualization and analyses. RESULTS: In this study, we introduce RiboGraph, a dedicated visualization tool designed to work with .ribo files, a specialized and efficient format for ribosome occupancy data. Unlike existing solutions that rely on large alignment files and time-consuming preprocessing steps, RiboGraph operates on a purpose designed compact file type. This efficiency allows for interactive, real-time visualization at ribosome-protected fragment length resolution. By providing an integrated toolset, RiboGraph empowers researchers to conduct comprehensive visual analysis of ribosome occupancy data. AVAILABILITY AND IMPLEMENTATION: Source code, step-by-step installation instructions and links to documentation are available on GitHub: https://github.com/ribosomeprofiling/ribograph. On the same page, we provide test files and a step-by-step tutorial highlighting the key features of RiboGraph.
Subject(s)
Ribosomes , Software , Ribosomes/metabolism , Computational Biology/methods , Ribosome Profiling
ABSTRACT
Multiplexed assays of variant effect are powerful methods to profile the consequences of rare variants on gene expression and organismal fitness. Yet, few studies have integrated several multiplexed assays to map variant effects on gene expression in coding sequences. Here, we pioneered a multiplexed assay based on polysome profiling to measure variant effects on translation at scale, uncovering single-nucleotide variants that increase or decrease ribosome load. By combining high-throughput ribosome load data with multiplexed mRNA and protein abundance readouts, we mapped the cis-regulatory landscape of thousands of catechol-O-methyltransferase (COMT) variants from RNA to protein and found numerous coding variants that alter COMT expression. Finally, we trained machine learning models to map signatures of variant effects on COMT gene expression and uncovered both directional and divergent impacts across expression layers. Our analyses reveal expression phenotypes for thousands of variants in COMT and highlight variant effects on both single and multiple layers of expression. Our findings prompt future studies that integrate several multiplexed assays for the readout of gene expression.
Subject(s)
Catechol O-Methyltransferase , Machine Learning , Single Nucleotide Polymorphism , Catechol O-Methyltransferase/genetics , Catechol O-Methyltransferase/metabolism , Humans , Messenger RNA/genetics , Messenger RNA/metabolism , Ribosomes/metabolism , Ribosomes/genetics , Protein Biosynthesis
ABSTRACT
In light of the numerous studies identifying post-transcriptional regulators on the surface of the endoplasmic reticulum (ER), we asked whether there are factors that regulate compartment-specific mRNA translation in human cells. Using a proteomic survey of spatially regulated polysome-interacting proteins, we identified the glycolytic enzyme Pyruvate Kinase M (PKM) as a cytosolic (i.e. ER-excluded) polysome interactor and investigated how it influences mRNA translation. We discovered that the PKM-polysome interaction is directly regulated by ADP levels, providing a link between carbohydrate metabolism and mRNA translation. By performing enhanced crosslinking immunoprecipitation-sequencing (eCLIP-seq), we found that PKM crosslinks to mRNA sequences that are immediately downstream of regions that encode lysine- and glutamate-enriched tracts. Using ribosome footprint protection sequencing, we found that PKM binding to ribosomes causes translational stalling near lysine and glutamate encoding sequences. Lastly, we observed that PKM recruitment to polysomes is dependent on poly-ADP ribosylation activity (PARylation) and may depend on co-translational PARylation of lysine and glutamate residues of nascent polypeptide chains. Overall, our study uncovers a novel role for PKM in post-transcriptional gene regulation, linking cellular metabolism and mRNA translation.
Subject(s)
Poly ADP Ribosylation , Protein Biosynthesis , Pyruvate Kinase , Humans , Glutamates/analysis , Glutamates/genetics , Glutamates/metabolism , Lysine/metabolism , Proteomics , Pyruvate Kinase/genetics , Pyruvate Kinase/analysis , Pyruvate Kinase/metabolism , Ribosomes/metabolism
ABSTRACT
Viruses rely on the host translation machinery to synthesize their own proteins. Consequently, they have evolved varied mechanisms to co-opt host translation for their survival. SARS-CoV-2 relies on a nonstructural protein, Nsp1, for shutting down host translation. However, it is currently unknown how viral proteins and host factors critical for viral replication can escape a global shutdown of host translation. Here, using a novel FACS-based assay called MeTAFlow, we report a dose-dependent reduction in both nascent protein synthesis and mRNA abundance in cells expressing Nsp1. We perform RNA-seq and matched ribosome profiling experiments to identify gene-specific changes both at the mRNA expression and translation levels. We discover that a functionally coherent subset of human genes is preferentially translated in the context of Nsp1 expression. These genes include the translation machinery components, RNA binding proteins, and others important for viral pathogenicity. Importantly, we uncovered a remarkable enrichment of 5' terminal oligo-pyrimidine (TOP) tracts among preferentially translated genes. Using reporter assays, we validated that 5' UTRs from TOP transcripts can drive preferential expression in the presence of Nsp1. Finally, we found that LARP1, a key effector protein in the mTOR pathway, may contribute to preferential translation of TOP transcripts in response to Nsp1 expression. Collectively, our study suggests fine-tuning of host gene expression and translation by Nsp1 despite its global repressive effect on host protein synthesis.
Subject(s)
Host-Pathogen Interactions/genetics , Protein Biosynthesis , Proteins/chemistry , Proteins/genetics , Viral Nonstructural Proteins/genetics , 5' Untranslated Regions , Autoantigens/genetics , Autoantigens/metabolism , Gene Expression Regulation , HEK293 Cells , Humans , Protein Folding , Pyrimidines , Messenger RNA/genetics , Ribonucleoproteins/genetics , Ribonucleoproteins/metabolism , Ribosomes/genetics , Ribosomes/virology , TOR Serine-Threonine Kinases/genetics , TOR Serine-Threonine Kinases/metabolism , Viral Nonstructural Proteins/metabolism , SS-B Antigen
ABSTRACT
Non-canonical intronic variants are a poorly characterized yet highly prevalent class of alterations associated with Mendelian disorders. Here, we report the first RNA expression and splicing analysis from a family whose members carry a non-canonical splice variant in an intron of RPL11 (c.396 +3A>G). This mutation is causative for Diamond Blackfan Anemia (DBA) in this family despite incomplete penetrance and variable expressivity. Our analyses revealed a complex pattern of disruptions with many novel junctions of RPL11. These include an RPL11 transcript that is translated with a late stop codon in the 3' untranslated region (3'UTR) of the main isoform. We observed that RPL11 transcript abundance is comparable among carriers regardless of symptom severity. Interestingly, both the small and large ribosomal subunit transcripts were significantly overexpressed in individuals with a history of anemia in addition to congenital abnormalities. Finally, we discovered that coordinated expression between mitochondrial components and RPL11 was lost in all carriers, which may lead to variable expressivity. Overall, this study highlights the importance of RNA splicing and expression analyses in families for molecular characterization of Mendelian diseases.
Subject(s)
Diamond-Blackfan Anemia , Mitochondrial Genes , Ribosomal Proteins , Diamond-Blackfan Anemia/genetics , Humans , Mutation , RNA Splicing , Rare Diseases/genetics , Ribosomal Proteins/genetics
ABSTRACT
SUMMARY: Ribosome occupancy measurements enable protein abundance estimation and inference of translation mechanisms. Recent studies have revealed that sequence read lengths in ribosome profiling data are highly variable and carry critical information. Consequently, data analyses require the computation and storage of multiple metrics for a wide range of ribosome footprint lengths. We developed a software ecosystem including a new efficient binary file format named 'ribo'. Ribo files store all essential data grouped by ribosome footprint lengths. Users can assemble ribo files using our RiboFlow pipeline that processes raw ribosome profiling sequencing data. RiboFlow is highly portable and customizable across a large number of computational environments with built-in capabilities for parallelization. We also developed interfaces for writing and reading ribo files in the R (RiboR) and Python (RiboPy) environments. Using RiboR and RiboPy, users can efficiently access ribosome profiling quality control metrics, generate essential plots and carry out analyses. Altogether, these components create a software ecosystem for researchers to study translation through ribosome profiling. AVAILABILITY AND IMPLEMENTATION: For a quickstart, please see https://ribosomeprofiling.github.io. Source code, installation instructions and links to documentation are available on GitHub: https://github.com/ribosomeprofiling. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Ecosystem , Ribosomes , Proteins , Sequence Analysis , Software
ABSTRACT
Introns are found in 5' untranslated regions (5'UTRs) for 35% of all human transcripts. These 5'UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5'UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5'UTR intron status, we developed a classifier that can predict 5'UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5' proximal-intron-minus-like-coding regions ("5IM" transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5' cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5' proximal positions. Finally, N1-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ~20% of human transcripts. This class is defined by depletion of 5' proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N1-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC.
Subject(s)
5' Untranslated Regions , Adenosine/analogs & derivatives , Base Sequence , Introns , Protein Biosynthesis , Sequence Deletion , Adenosine/genetics , Adenosine/metabolism , Initiator Codon/chemistry , Initiator Codon/metabolism , Eukaryotic Initiation Factor-4E/genetics , Eukaryotic Initiation Factor-4E/metabolism , Exons , Humans , Open Reading Frames , Protein Binding , Ribosomes/genetics , Ribosomes/metabolism
ABSTRACT
Elucidating the consequences of genetic differences between humans is essential for understanding phenotypic diversity and personalized medicine. Although variation in RNA levels, transcription factor binding, and chromatin have been explored, little is known about global variation in translation and its genetic determinants. We used ribosome profiling, RNA sequencing, and mass spectrometry to perform an integrated analysis in lymphoblastoid cell lines from a diverse group of individuals. We find significant differences in RNA, translation, and protein levels suggesting diverse mechanisms of personalized gene expression control. Combined analysis of RNA expression and ribosome occupancy improves the identification of individual protein level differences. Finally, we identify genetic differences that specifically modulate ribosome occupancy; many of these differences lie close to start codons and upstream ORFs. Our results reveal a new level of gene expression variation among humans and indicate that genetic variants can cause changes in protein levels through effects on translation.
Subject(s)
Single Nucleotide Polymorphism , Protein Biosynthesis , RNA/metabolism , Chromatin/genetics , Chromatin/metabolism , Gene Expression Profiling , Gene Expression Regulation , Humans , Proteomics , Quantitative Trait Loci , Messenger RNA/genetics , Messenger RNA/metabolism , Ribosomes/genetics , Ribosomes/metabolism , Sequence Alignment , RNA Sequence Analysis
ABSTRACT
Deep sequencing of strand-specific cDNA libraries is now a ubiquitous tool for identifying and quantifying RNAs in diverse sample types. The accuracy of conclusions drawn from these analyses depends on precise and quantitative conversion of the RNA sample into a DNA library suitable for sequencing. Here, we describe an optimized method of preparing strand-specific RNA deep sequencing libraries from small RNAs and variably sized RNA fragments obtained from ribonucleoprotein particle footprinting experiments or fragmentation of long RNAs. Our approach works across a wide range of input amounts (400 pg to 200 ng), is easy to follow and produces a library in 2-3 days at relatively low reagent cost, all while giving the user complete control over every step. Because all enzymatic reactions were optimized and driven to apparent completion, sequence diversity and species abundance in the input sample are well preserved.
Subject(s)
Gene Library , High-Throughput Nucleotide Sequencing/methods , RNA Sequence Analysis/methods , Circular DNA/chemistry , Single-Stranded DNA/chemistry , Polyacrylamide Gel Electrophoresis , MicroRNAs/chemistry , Reverse Transcriptase Polymerase Chain Reaction
ABSTRACT
In higher eukaryotes, most mRNAs that encode secreted or membrane-bound proteins contain elements that promote an alternative mRNA nuclear export (ALREX) pathway. Here we report that ALREX-promoting elements also potentiate translation in the presence of upstream nuclear factors. These RNA elements interact directly with, and likely co-evolved with, the zinc finger repeats of RanBP2/Nup358, which is present on the cytoplasmic face of the nuclear pore. Finally we show that RanBP2/Nup358 is not only required for the stimulation of translation by ALREX-promoting elements, but is also required for the efficient global synthesis of proteins targeted to the endoplasmic reticulum (ER) and likely the mitochondria. Thus upon the completion of export, mRNAs containing ALREX-elements likely interact with RanBP2/Nup358, and this step is required for the efficient translation of these mRNAs in the cytoplasm. ALREX-elements thus act as nucleotide platforms to coordinate various steps of post-transcriptional regulation for the majority of mRNAs that encode secreted proteins.
Subject(s)
Molecular Chaperones/physiology , Nuclear Pore Complex Proteins/physiology , Messenger RNA/metabolism , Endoplasmic Reticulum/metabolism , Glycosylation , HeLa Cells , Humans , Polyribosomes/metabolism , Protein Biosynthesis , Post-Translational Protein Processing , Protein Sorting Signals , Protein Transport , Proteins/genetics , Proteins/metabolism , Post-Transcriptional RNA Processing , RNA Transport , Messenger RNA/genetics , Secretory Pathway , Zinc Fingers
ABSTRACT
SUMMARY: Unlike DNA, RNA abundances can vary over several orders of magnitude. Thus, identification of RNA-protein binding sites from high-throughput sequencing data presents unique challenges. Although peak identification in ChIP-Seq data has been extensively explored, there are few bioinformatics tools tailored for peak calling on analogous datasets for RNA-binding proteins. Here we describe ASPeak (abundance sensitive peak detection algorithm), an implementation of an algorithm that we previously applied to detect peaks in exon junction complex RNA immunoprecipitation in tandem experiments. Our peak detection algorithm yields stringent and robust target sets enabling sensitive motif finding and downstream functional analyses. AVAILABILITY: ASPeak is implemented in Perl as a complete pipeline that takes bedGraph files as input. ASPeak implementation is freely available at https://sourceforge.net/projects/as-peak under the GNU General Public License. ASPeak can be run on a personal computer, yet is designed to be easily parallelizable. ASPeak can also run on high performance computing clusters providing efficient speedup. The documentation and user manual can be obtained from http://master.dl.sourceforge.net/project/as-peak/manual.pdf.
Subject(s)
Algorithms , RNA Sequence Analysis/methods , Software , Internet , RNA-Binding Proteins/analysis
ABSTRACT
Although introns in 5'- and 3'-untranslated regions (UTRs) are found in many protein coding genes, rarely are they considered distinctive entities with specific functions. Indeed, mammalian transcripts with 3'-UTR introns are often assumed nonfunctional because they are subject to elimination by nonsense-mediated decay (NMD). Nonetheless, recent findings indicate that 5'- and 3'-UTR intron status is of significant functional consequence for the regulation of mammalian genes. Therefore these features should be ignored no longer.
Subject(s)
3' Untranslated Regions , 5' Untranslated Regions , Gene Expression Regulation , Introns , Animals , Humans , Nonsense-Mediated mRNA Decay , Organ Specificity , Messenger RNA/metabolism , Ribonucleoproteins/genetics , Ribonucleoproteins/metabolism
ABSTRACT
In higher eukaryotes, messenger RNAs (mRNAs) are exported from the nucleus to the cytoplasm via factors deposited near the 5' end of the transcript during splicing. The signal sequence coding region (SSCR) can support an alternative mRNA export (ALREX) pathway that does not require splicing. However, most SSCR-containing genes also have introns, so the interplay between these export mechanisms remains unclear. Here we support a model in which the furthest upstream element in a given transcript, be it an intron or an ALREX-promoting SSCR, dictates the mRNA export pathway used. We also experimentally demonstrate that nuclear-encoded mitochondrial genes can use the ALREX pathway. Thus, ALREX can also be supported by nucleotide signals within mitochondrial-targeting sequence coding regions (MSCRs). Finally, we identified and experimentally verified novel motifs associated with the ALREX pathway that are shared by both SSCRs and MSCRs. Our results show strong correlation between 5' untranslated region (5'UTR) intron presence/absence and sequence features at the beginning of the coding region. They also suggest that genes encoding secretory and mitochondrial proteins share a common regulatory mechanism at the level of mRNA export.
Subject(s)
5' Untranslated Regions/genetics , Alternative Splicing , Cell Nucleus/metabolism , RNA Transport , Messenger RNA/metabolism , Cell Nucleus Active Transport , Adenine/metabolism , Cytoplasm , Endoplasmic Reticulum/genetics , Gene Expression Regulation , Mitochondrial Genes , Humans , Introns , Genetic Models , Open Reading Frames , Protein Sorting Signals , RNA Splicing
ABSTRACT
Long-read sequencing technology enables highly accurate detection of allele-specific RNA expression, providing insights into the effects of genetic variation on splicing and RNA abundance. Furthermore, the ability to directly sequence RNA promises the detection of RNA modifications in tandem with ascertaining the allelic origin of each molecule. Here, we leverage these advantages to determine allele-biased patterns of N6-methyladenosine (m6A) modifications in native mRNA. We utilized human and mouse cells with known genetic variants to assign allelic origin of each mRNA molecule combined with a supervised machine learning model to detect read-level m6A modification ratios. Our analyses revealed the importance of sequences adjacent to the DRACH-motif in determining m6A deposition, in addition to allelic differences that directly alter the motif. Moreover, we discovered allele-specific m6A modification (ASM) events with no genetic variants in close proximity to the differentially modified nucleotide, demonstrating the unique advantage of using long reads and surpassing the capabilities of antibody-based short-read approaches. This technological advancement promises to advance our understanding of the role of genetics in determining mRNA modifications.
ABSTRACT
Ribosome profiling is a widely-used technique for measuring ribosome occupancy at nucleotide resolution. However, the need to analyze this data at nucleotide resolution introduces unique challenges in data visualization and analyses. In this study, we introduce RiboGraph, a dedicated visualization tool designed to work with .ribo files, a specialized and efficient format for ribosome occupancy data. Unlike existing solutions that rely on large alignment files and time-consuming preprocessing steps, RiboGraph operates on a purpose designed compact file type and eliminates the need for data preprocessing. This efficiency allows for interactive, real-time visualization at ribosome-protected fragment length resolution. By providing an integrated toolset, RiboGraph empowers researchers to conduct comprehensive visual analysis of ribosome occupancy data. Availability and Implementation: Source code, step-by-step installation instructions and links to documentation are available on GitHub: https://github.com/ribosomeprofiling/ribograph. On the same page, we provide test files and a step-by-step tutorial highlighting the key features of RiboGraph.
ABSTRACT
The degree to which translational control is specified by mRNA sequence is poorly understood in mammalian cells. Here, we constructed and leveraged a compendium of 3,819 ribosomal profiling datasets, distilling them into a transcriptome-wide atlas of translation efficiency (TE) measurements encompassing >140 human and mouse cell types. We subsequently developed RiboNN, a multitask deep convolutional neural network, and classic machine learning models to predict TEs in hundreds of cell types from sequence-encoded mRNA features, achieving state-of-the-art performance (r=0.79 in human and r=0.78 in mouse for mean TE across cell types). While the majority of earlier models solely considered 5' UTR sequence, RiboNN integrates contributions from the full-length mRNA sequence, learning that the 5' UTR, CDS, and 3' UTR respectively possess ~67%, 31%, and 2% per-nucleotide information density in the specification of mammalian TEs. Interpretation of RiboNN revealed that the spatial positioning of low-level di- and tri-nucleotide features (including codons) largely explains model performance, capturing mechanistic principles such as how ribosomal processivity and tRNA abundance control translational output. RiboNN is predictive of the translational behavior of base-modified therapeutic RNA, and can explain evolutionary selection pressures in human 5' UTRs. Finally, it detects a common language governing mRNA regulatory control and highlights the interconnectedness of mRNA translation, stability, and localization in mammalian organisms.
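As a rough illustration of the translation efficiency (TE) quantity referred to above, the sketch below uses a common simplification: the log2 ratio of library-size-normalized ribosome footprint counts to RNA-seq counts per gene. The abstract does not state how the atlas TEs were derived, so the function `translation_efficiency`, the pseudocount, and the toy counts are all assumptions made for illustration, not the authors' procedure.

```python
import math


def translation_efficiency(ribo_counts, rna_counts, pseudocount=1.0):
    """Per-gene log2(ribosome CPM / RNA CPM) for genes present in both inputs.

    ribo_counts, rna_counts: dicts mapping gene name -> raw read count.
    The pseudocount (an assumed default) avoids division by zero for
    genes with no reads in one of the libraries.
    """
    ribo_total = sum(ribo_counts.values())
    rna_total = sum(rna_counts.values())
    te = {}
    for gene in ribo_counts.keys() & rna_counts.keys():
        # Normalize each library to counts per million before taking the ratio.
        ribo_cpm = (ribo_counts[gene] + pseudocount) / ribo_total * 1e6
        rna_cpm = (rna_counts[gene] + pseudocount) / rna_total * 1e6
        te[gene] = math.log2(ribo_cpm / rna_cpm)
    return te


if __name__ == "__main__":
    # Toy counts, invented purely for illustration.
    ribo = {"geneA": 99, "geneB": 299}
    rna = {"geneA": 199, "geneB": 199}
    for gene, value in sorted(translation_efficiency(ribo, rna).items()):
        print(gene, round(value, 3))
```

A negative value means fewer footprints than expected from RNA abundance and a positive value means more. Analyses at the scale described in the abstract would typically need more careful normalization than this ratio.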