ABSTRACT
An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity. We experimentally validate these sequences, use the measurements to re-optimize the predictor, and design a second generation of enhancers with improved specificity. Our design methods embed relevant transcription factor binding site (TFBS) motifs at higher frequencies than comparable endogenous enhancers while using a more selective motif vocabulary, and we show that enhancer activity is correlated with transcription factor expression at the single-cell level. Finally, we characterize causal features of top enhancers via perturbation experiments and show that enhancers as short as 50 bp can maintain specificity.
ABSTRACT
Translation control is essential in balancing hematopoietic precursors and differentiation; however, the mechanisms underlying this program are poorly understood. We found that the activity of the major cap-binding protein eIF4E is unexpectedly regulated in a dynamic manner throughout erythropoiesis that is uncoupled from global protein synthesis rates. Moreover, eIF4E activity directs erythroid maturation, and increased eIF4E expression maintains cells in an early erythroid state associated with a translation program driving the expression of PTPN6 and Igf2bp1. A cytosine-enriched motif in the 5' untranslated region is important for eIF4E-mediated translation specificity. Therefore, selective translation of key target genes necessary for the maintenance of early erythroid states by eIF4E highlights a unique mechanism used by hematopoietic precursors to rapidly elicit erythropoietic maturation upon need.
ABSTRACT
Therapeutic mRNAs and vaccines are being developed for a broad range of human diseases, including COVID-19. However, their optimization is hindered by mRNA instability and inefficient protein expression. Here, we describe design principles that overcome these barriers. We develop an RNA sequencing-based platform called PERSIST-seq to systematically delineate in-cell mRNA stability, ribosome load, as well as in-solution stability of a library of diverse mRNAs. We find that, surprisingly, in-cell stability is a greater driver of protein output than high ribosome load. We further introduce a method called In-line-seq, applied to thousands of diverse RNAs, that reveals sequence and structure-based rules for mitigating hydrolytic degradation. Our findings show that highly structured "superfolder" mRNAs can be designed to improve both stability and expression with further enhancement through pseudouridine nucleoside modification. Together, our study demonstrates simultaneous improvement of mRNA stability and protein expression and provides a computational-experimental platform for the enhancement of mRNA medicines.
Subjects
COVID-19; RNA; COVID-19/therapy; Humans; Pseudouridine/metabolism; RNA Stability/genetics; RNA, Messenger/metabolism
ABSTRACT
Roles for ribosomal RNA (rRNA) in gene regulation remain largely unexplored. With hundreds of rDNA units positioned across multiple loci, it is not possible to genetically modify rRNA in mammalian cells, hindering understanding of ribosome function. It remains elusive whether expansion segments (ESs), tentacle-like rRNA extensions that vary in sequence and size across eukaryotic evolution, may have functional roles in translation control. Here, we develop variable expansion segment-ligand chimeric ribosome immunoprecipitation RNA sequencing (VELCRO-IP RNA-seq), a versatile methodology to generate species-adapted ESs and to map specific mRNA regions across the transcriptome that preferentially associate with ESs. Application of VELCRO-IP RNA-seq to a mammalian ES, ES9S, identified a large array of transcripts that are selectively recruited to ribosomes via an ES. We further characterize a set of 5' UTRs that facilitate cap-independent translation through ES9S-mediated ribosome binding. Thus, we present a technology for studying the enigmatic ESs of the ribosome, revealing their function in gene-specific translation.
Subjects
RNA-Seq/methods; RNA/genetics; Ribosomes/genetics; 5' Untranslated Regions; Animals; Female; Humans; Immunoprecipitation/methods; Mice; Plasmids/genetics; Pregnancy; Protein Biosynthesis; RNA/analysis; RNA/metabolism; Ribosomes/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism
ABSTRACT
The lack of knowledge about extreme conservation in genomes remains a major gap in our understanding of the evolution of gene regulation. Here, we reveal an unexpected role of extremely conserved 5' untranslated regions (UTRs) in noncanonical translational regulation that is linked to the emergence of essential developmental features in vertebrate species. Endogenous deletion of conserved elements within these 5' UTRs decreased gene expression, and extremely conserved 5' UTRs possess cis-regulatory elements that promote cell-type-specific regulation of translation. We further developed in-cell mutate-and-map (icM2), a new methodology that maps RNA structure inside cells. Using icM2, we determined that an extremely conserved 5' UTR encodes multiple alternative structures and that each single nucleotide within the conserved element maintains the balance of alternative structures important to control the dynamic range of protein expression. These results explain how extreme sequence conservation can lead to RNA-level biological functions encoded in the untranslated regions of vertebrate genomes.
Subjects
5' Untranslated Regions/genetics; Conserved Sequence/genetics; Vertebrates/genetics; Animals; Base Sequence; Enhancer Elements, Genetic/genetics; Genome; Mice; Nucleic Acid Conformation; Protein Biosynthesis; RNA/chemistry; RNA/genetics
ABSTRACT
Although gene expression is tightly regulated during embryonic development, the impact of translational control has received less experimental attention. Here, we find that eukaryotic translation initiation factor-3 (eIF3) is required for Shh-mediated tissue patterning. Analysis of loss-of-function eIF3 subunit c (Eif3c) mice reveals a unique sensitivity to the dosage of the Shh receptor patched 1 (Ptch1). Genome-wide in vivo enhanced cross-linking immunoprecipitation sequencing (eCLIP-seq) shows unexpected specificity for eIF3 binding to a pyrimidine-rich motif present in subsets of 5'-UTRs, and ribosome profiling in Eif3c loss-of-function embryos shows a corresponding change in the translation of these transcripts. We further find a transcript-specific effect in Eif3c loss-of-function embryos whereby translation of Ptch1 through this pyrimidine-rich motif is specifically sensitive to eIF3 amount. Altogether, this work uncovers hidden specificity of housekeeping translation initiation machinery for the translation of key developmental signaling transcripts.
Subjects
Eukaryotic Initiation Factor-3/metabolism; Protein Biosynthesis/physiology; Protein Processing, Post-Translational/physiology; Ribosomes/metabolism; Animals; Cell Line; Eukaryotic Initiation Factor-3/genetics; Humans; Mice; RNA, Messenger/genetics; Signal Transduction/physiology
ABSTRACT
Therapeutic mRNAs and vaccines are being developed for a broad range of human diseases, including COVID-19. However, their optimization is hindered by mRNA instability and inefficient protein expression. Here, we describe design principles that overcome these barriers. We develop a new RNA sequencing-based platform called PERSIST-seq to systematically delineate in-cell mRNA stability, ribosome load, as well as in-solution stability of a library of diverse mRNAs. We find that, surprisingly, in-cell stability is a greater driver of protein output than high ribosome load. We further introduce a method called In-line-seq, applied to thousands of diverse RNAs, that reveals sequence and structure-based rules for mitigating hydrolytic degradation. Our findings show that "superfolder" mRNAs can be designed to improve both stability and expression, and that both are further enhanced through pseudouridine nucleoside modification. Together, our study demonstrates simultaneous improvement of mRNA stability and protein expression and provides a computational-experimental platform for the enhancement of mRNA medicines.