ABSTRACT
We set out to exhaustively characterize the impact of the cis-chromatin environment on prime editing, a precise genome engineering tool. Using a highly sensitive method for mapping the genomic locations of randomly integrated reporters, we discover massive position effects, exemplified by editing efficiencies ranging from ~0% to 94% for an identical target site and edit. Position effects on prime editing efficiency are well predicted by chromatin marks, e.g., positively by H3K79me2 and negatively by H3K9me3. Next, we developed a multiplex perturbational framework to assess the interaction of trans-acting factors with the cis-chromatin environment on editing outcomes. Applying this framework to DNA repair factors, we identify HLTF as a context-dependent repressor of prime editing. Finally, several lines of evidence suggest that active transcriptional elongation enhances prime editing. Consistent with this, we show we can robustly decrease or increase the efficiency of prime editing by preceding it with CRISPR-mediated silencing or activation, respectively.
Subjects
CRISPR-Cas Systems , Chromatin , Epigenesis, Genetic , Gene Editing , Humans , Chromatin/metabolism , Chromatin/genetics , CRISPR-Cas Systems/genetics , Gene Editing/methods , Histones/metabolism , Transcription Factors/metabolism , Histone Code
ABSTRACT
Coexpression of proteins in response to pathway-inducing signals is the founding paradigm of gene regulation. However, it remains unexplored whether the relative abundance of co-regulated proteins requires precise tuning. Here, we present large-scale analyses of protein stoichiometry and corresponding regulatory strategies for 21 pathways and 67-224 operons in divergent bacteria separated by 0.6-2 billion years. Using end-enriched RNA-sequencing (Rend-seq) with single-nucleotide resolution, we found that many bacterial gene clusters encoding conserved pathways have undergone massive divergence in transcript abundance and architectures via remodeling of internal promoters and terminators. Remarkably, these evolutionary changes are compensated post-transcriptionally to maintain preferred stoichiometry of protein synthesis rates. Even more strikingly, in eukaryotic budding yeast, functionally analogous proteins that arose independently from bacterial counterparts also evolved to convergent in-pathway expression. The broad requirement for exact protein stoichiometries despite regulatory divergence provides an unexpected principle for building biological pathways both in nature and for synthetic activities.
Subjects
Enzymes/chemistry , Escherichia coli/enzymology , Evolution, Molecular , Protein Isoforms/chemistry , Bacillus subtilis/enzymology , Bacillus subtilis/genetics , Escherichia coli/genetics , Gene Expression Regulation, Bacterial , Humans , Multigene Family , Operon , Phylogeny , Promoter Regions, Genetic , RNA, Messenger/metabolism , Ribosomes/chemistry , Sequence Analysis, RNA , Transcriptome
ABSTRACT
Measurements of gene expression or signal transduction activity are conventionally performed using methods that require either the destruction or live imaging of a biological sample within the timeframe of interest. Here we demonstrate an alternative paradigm in which such biological activities are stably recorded to the genome. Enhancer-driven genomic recording of transcriptional activity in multiplex (ENGRAM) is based on the signal-dependent production of prime editing guide RNAs that mediate the insertion of signal-specific barcodes (symbols) into a genomically encoded recording unit. We show how this strategy can be used for multiplex recording of the cell-type-specific activities of dozens to hundreds of cis-regulatory elements with high fidelity, sensitivity and reproducibility. Leveraging signal transduction pathway-responsive cis-regulatory elements, we also demonstrate time- and concentration-dependent genomic recording of WNT, NF-κB and Tet-On activities. By coupling ENGRAM to sequential genome editing via DNA Typewriter1, we stably record information about the temporal dynamics of two orthogonal signalling pathways to genomic DNA. Finally we apply ENGRAM to integratively record the transient activity of nearly 100 transcription factor consensus motifs across daily windows spanning the differentiation of mouse embryonic stem cells into gastruloids, an in vitro model of early mammalian development. Although these are proof-of-concept experiments and much work remains to fully realize the possibilities, the symbolic recording of biological signals or states within cells, to the genome and over time, has broad potential to complement contemporary paradigms for how we make measurements in biological systems.
Subjects
DNA , Gene Editing , Signal Transduction , Transcription, Genetic , Animals , Mice , Cell Differentiation/genetics , DNA/genetics , DNA/metabolism , Enhancer Elements, Genetic/genetics , Gene Editing/methods , Genomics , Mouse Embryonic Stem Cells/cytology , NF-kappa B/metabolism , Reproducibility of Results , RNA, Guide, CRISPR-Cas Systems/genetics , RNA, Guide, CRISPR-Cas Systems/metabolism , Signal Transduction/genetics , Time Factors , Transcription Factors/metabolism , Transcription, Genetic/genetics , Wnt Signaling Pathway/genetics , Nucleotide Motifs , Consensus Sequence/genetics , Developmental Biology , Proof of Concept Study
ABSTRACT
The inability to scalably and precisely measure the activity of developmental cis-regulatory elements (CREs) in multicellular systems is a bottleneck in genomics. Here we develop a dual RNA cassette that decouples the detection and quantification tasks inherent to multiplex single-cell reporter assays. The resulting measurement of reporter expression is accurate over multiple orders of magnitude, with a precision approaching the limit set by Poisson counting noise. Together with RNA barcode stabilization via circularization, these scalable single-cell quantitative expression reporters provide high-contrast readouts, analogous to classic in situ assays but entirely from sequencing. Screening >200 regions of accessible chromatin in a multicellular in vitro model of early mammalian development, we identify 13 (8 previously uncharacterized) autonomous and cell-type-specific developmental CREs. We further demonstrate that chimeric CRE pairs generate cognate two-cell-type activity profiles and assess gain- and loss-of-function multicellular expression phenotypes from CRE variants with perturbed transcription factor binding sites. Single-cell quantitative expression reporters can be applied in developmental and multicellular systems to quantitatively characterize native, perturbed and synthetic CREs at scale, with high sensitivity and at single-cell resolution.
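The "limit set by Poisson counting noise" mentioned above can be made concrete with a minimal sketch. It assumes reporter molecules are counted independently, so that a count with mean N has standard deviation sqrt(N) and a coefficient of variation of 1/sqrt(N). The function name `poisson_cv` and the example counts are illustrative and do not come from the paper.

```python
import math


def poisson_cv(mean_count: float) -> float:
    """Coefficient of variation (std / mean) of a Poisson count.

    For a Poisson variable the variance equals the mean, so
    std / mean = sqrt(mean) / mean = 1 / sqrt(mean).
    """
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return 1.0 / math.sqrt(mean_count)


if __name__ == "__main__":
    # Hypothetical mean reporter counts, chosen only for illustration.
    for n in (10, 100, 1000):
        print(f"mean count {n}: best-case relative error {poisson_cv(n):.3f}")
```

Under this assumption no counting-based assay can report a relative error below this floor, which is why precision is naturally compared against it.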
Subjects
Gene Expression Regulation, Developmental , Single-Cell Analysis , Single-Cell Analysis/methods , Animals , Mice , Genes, Reporter , Regulatory Sequences, Nucleic Acid , Humans , Transcription Factors/genetics , Transcription Factors/metabolism , Chromatin/genetics , Chromatin/metabolism , Regulatory Elements, Transcriptional , Gene Expression Profiling/methods
ABSTRACT
Bacterial protein synthesis rates have evolved to maintain preferred stoichiometries at striking precision, from the components of protein complexes to constituents of entire pathways. Setting relative protein production rates to be well within a factor of two requires concerted tuning of transcription, RNA turnover, and translation, allowing many potential regulatory strategies to achieve the preferred output. The last decade has seen a greatly expanded capacity for precise interrogation of each step of the central dogma genome-wide. Here, we summarize how these technologies have shaped the current understanding of diverse bacterial regulatory architectures underpinning stoichiometric protein synthesis. We focus on the emerging expanded view of bacterial operons, which encode diverse primary and secondary mRNA structures for tuning protein stoichiometry. Emphasis is placed on how quantitative tuning is achieved. We discuss the challenges and open questions in the application of quantitative, genome-wide methodologies to the problem of precise protein production.
Subjects
Escherichia coli , Operon , Escherichia coli/genetics , Protein Biosynthesis , Proteins/metabolism , RNA, Messenger/metabolism , Transcription, Genetic
ABSTRACT
Tight coupling of transcription and translation is considered a defining feature of bacterial gene expression1,2. The pioneering ribosome can both physically associate and kinetically coordinate with RNA polymerase (RNAP)3-11, forming a signal-integration hub for co-transcriptional regulation that includes translation-based attenuation12,13 and RNA quality control2. However, it remains unclear whether transcription-translation coupling, together with its broad functional consequences, is indeed a fundamental characteristic of bacteria other than Escherichia coli. Here we show that RNAPs outpace pioneering ribosomes in the Gram-positive model bacterium Bacillus subtilis, and that this 'runaway transcription' creates alternative rules for both global RNA surveillance and translational control of nascent RNA. In particular, uncoupled RNAPs in B. subtilis explain the diminished role of Rho-dependent transcription termination, as well as the prevalence of mRNA leaders that use riboswitches and RNA-binding proteins. More broadly, we identified widespread genomic signatures of runaway transcription in distinct phyla across the bacterial domain. Our results show that coupled RNAP-ribosome movement is not a general hallmark of bacteria. Instead, translation-coupled transcription and runaway transcription constitute two principal modes of gene expression that determine genome-specific regulatory mechanisms in prokaryotes.
Subjects
Bacillus subtilis/genetics , Gene Expression Regulation, Bacterial , Protein Biosynthesis , Transcription, Genetic , 5' Untranslated Regions/genetics , Bacillus subtilis/enzymology , Bacillus subtilis/metabolism , DNA-Directed RNA Polymerases/metabolism , Phylogeny , RNA, Bacterial/biosynthesis , RNA, Bacterial/metabolism , RNA, Messenger/biosynthesis , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Rho Factor/metabolism , Ribosomes/metabolism , Riboswitch/genetics
ABSTRACT
Sigma factors are an important class of bacterial transcription factors that lend specificity to RNA polymerases by binding to distinct promoter elements for genes in their regulons. Here we show that activation of the general stress sigma factor, σB, in Bacillus subtilis paradoxically leads to dramatic induction of translation for a subset of its regulon genes. These genes are translationally repressed when transcribed by the housekeeping sigma factor, σA, owing to extended RNA secondary structures as determined in vivo using DMS-MaPseq. Transcription from σB-dependent promoters ablates the secondary structures and activates translation, leading to dual induction. Translation efficiencies between σB- and σA-dependent RNA isoforms can vary by up to 100-fold, which in multiple cases exceeds the magnitude of transcriptional induction. These results highlight the role of long-range RNA folding in modulating translation and demonstrate that a transcription factor can regulate protein synthesis beyond its effects on transcript levels.
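The "dual induction" described above can be illustrated with a small worked calculation. This is a sketch under the common simplifying assumption that the protein synthesis rate equals mRNA abundance multiplied by translation efficiency, so that the two fold changes multiply. The function name and the example fold changes are hypothetical and are not measurements from the paper.

```python
def protein_fold_induction(transcript_fold: float, te_fold: float) -> float:
    """Fold change in protein synthesis rate.

    Assumes synthesis rate = mRNA abundance * translation efficiency (TE),
    so a fold change in each factor multiplies into the total.
    """
    if transcript_fold <= 0 or te_fold <= 0:
        raise ValueError("fold changes must be positive")
    return transcript_fold * te_fold


if __name__ == "__main__":
    # Hypothetical values: 10-fold more transcript, 100-fold higher TE.
    total = protein_fold_induction(transcript_fold=10.0, te_fold=100.0)
    print(f"combined induction: {total:.0f}-fold")
```

With these illustrative numbers the translational contribution exceeds the transcriptional one, which is the situation the abstract reports for several genes.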
ABSTRACT
During steady-state cell growth, individual enzymatic fluxes can be directly inferred from growth rate by mass conservation, but the inverse problem remains unsolved. Perturbing the flux and expression of a single enzyme could have pleiotropic effects that may or may not dominate the impact on cell fitness. Here, we quantitatively dissect the molecular and global responses to varied expression of translation termination factors (peptide release factors, RFs) in the bacterium Bacillus subtilis. While endogenous RF expression maximizes proliferation, deviations in expression lead to unexpected distal regulatory responses that dictate fitness reduction. Molecularly, RF depletion causes expression imbalance at specific operons, which activates master regulators and detrimentally overrides the transcriptome. Through these spurious connections, RF abundances are thus entrenched by focal points within the regulatory network, in one case located at a single stop codon. Such regulatory entrenchment suggests that predictive bottom-up models of expression-fitness landscapes will require near-exhaustive characterization of parts.
Subjects
Bacillus subtilis/genetics , Gene Expression Regulation, Bacterial , Peptide Termination Factors/metabolism , Protein Biosynthesis , Bacterial Proteins/metabolism , Base Sequence , CRISPR-Cas Systems/genetics , Genome, Bacterial , Proteome/metabolism , Stress, Physiological/genetics , Transcription, Genetic
ABSTRACT
Accurate measurements of cellular protein concentrations are invaluable to quantitative studies of gene expression and physiology in living cells. Here, we developed a versatile mass spectrometric workflow based on data-independent acquisition proteomics (DIA/SWATH) together with a novel protein inference algorithm (xTop). We used this workflow to accurately quantify absolute protein abundances in Escherichia coli for >2,000 proteins over >60 growth conditions, including nutrient limitations, non-metabolic stresses, and non-planktonic states. The resulting high-quality dataset of protein mass fractions allowed us to characterize proteome responses from a coarse (groups of related proteins) to a fine (individual) protein level. The dataset revealed several novel biological findings, including the generic upregulation of low-abundance proteins under various metabolic limitations, the non-specificity of catabolic enzymes upregulated under carbon limitation, the lack of large-scale proteome reallocation under stress compared to nutrient limitations, as well as surprising strain-dependent effects important for biofilm formation. These results present valuable resources for the systems biology community and can be used for future multi-omics studies of gene regulation and metabolic control in E. coli.
Subjects
Escherichia coli Proteins/metabolism , Escherichia coli/growth & development , Proteomics/methods , Algorithms , Bacteriological Techniques , Escherichia coli/metabolism , Mass Spectrometry , Stress, Physiological , Systems Biology , Workflow
ABSTRACT
Endonucleolytic cleavage within polycistronic mRNAs can lead to differential stability, and thus discordant abundance, among cotranscribed genes. RNase Y, the major endonuclease for mRNA decay in Bacillus subtilis, was originally identified for its cleavage activity toward the cggR-gapA operon, an event that differentiates the synthesis of a glycolytic enzyme from its transcriptional regulator. A three-protein Y-complex (YlbF, YmcA, and YaaT) was recently identified as also being required for this cleavage in vivo, raising the possibility that it is an accessory factor acting to regulate RNase Y. However, whether the Y-complex is broadly required for RNase Y activity is unknown. Here, we used end-enrichment RNA sequencing (Rend-seq) to globally identify operon mRNAs that undergo maturation posttranscriptionally by RNase Y and the Y-complex. We found that the Y-complex is required for the majority of RNase Y-mediated mRNA maturation events and also affects riboswitch abundance in B. subtilis. In contrast, noncoding RNA maturation by RNase Y often does not require the Y-complex. Furthermore, deletion of RNase Y has more pleiotropic effects on the transcriptome and cell growth than deletions of the Y-complex. We propose that the Y-complex is a specificity factor for RNase Y, with evidence that its role is conserved in Staphylococcus aureus.
Subjects
Bacillus subtilis/metabolism , Endoribonucleases/metabolism , RNA, Messenger/metabolism , Ribonucleases/metabolism , Bacterial Proteins/metabolism , Gene Expression Regulation, Bacterial/physiology , Operon/physiology , RNA Processing, Post-Transcriptional/physiology , RNA, Untranslated/metabolism , Staphylococcus aureus/metabolism , Transcriptome/physiology
ABSTRACT
Variability in the chemical composition of the extracellular environment can significantly degrade the ability of cells to detect rare cognate ligands. Using concepts from statistical detection theory, we formalize the generic problem of detection of small concentrations of ligands in a fluctuating background of biochemically similar ligands binding to the same receptors. We discover that in contrast with expectations arising from considerations of signal amplification, inhibitory interactions between receptors can improve detection performance in the presence of substantial environmental variability, providing an adaptive interpretation to the phenomenon of ligand antagonism. Our results suggest that the structure of signaling pathways responsible for chemodetection in fluctuating and heterogeneous environments might be optimized with respect to the statistics and dynamics of environmental composition. The developed formalism stresses the importance of characterizing nonspecific interactions to understand function in signaling pathways.
Subjects
Environment , Ligands , Models, Biological , Receptor Cross-Talk/physiology , Receptors, Cell Surface/metabolism , Signal Transduction/physiology , Computer Simulation
ABSTRACT
The functional consequences of structural variants (SVs) in mammalian genomes are challenging to study. This is due to several factors, including: 1) their numerical paucity relative to other forms of standing genetic variation such as single nucleotide variants (SNVs) and short insertions or deletions (indels); 2) the fact that a single SV can involve and potentially impact the function of more than one gene and/or cis regulatory element; and 3) the relative immaturity of methods to generate and map SVs, either randomly or in targeted fashion, in in vitro or in vivo model systems. Towards addressing these challenges, we developed Genome-Shuffle-seq, a straightforward method that enables the multiplex generation and mapping of several major forms of SVs (deletions, inversions, translocations) throughout a mammalian genome. Genome-Shuffle-seq is based on the integration of "shuffle cassettes" to the genome, wherein each shuffle cassette contains components that facilitate its site-specific recombination (SSR) with other integrated shuffle cassettes (via Cre-loxP), its mapping to a specific genomic location (via T7-mediated in vitro transcription or IVT), and its identification in single-cell RNA-seq (scRNA-seq) data (via T7-mediated in situ transcription or IST). In this proof-of-concept, we apply Genome-Shuffle-seq to induce and map thousands of genomic SVs in mouse embryonic stem cells (mESCs) in a single experiment. Induced SVs are rapidly depleted from the cellular population over time, possibly due to Cre-mediated toxicity and/or negative selection on the rearrangements themselves. Leveraging T7 IST of barcodes whose positions are already mapped, we further demonstrate that we can efficiently genotype which SVs are present in association with each of many single cell transcriptomes in scRNA-seq data. Finally, preliminary evidence suggests our method may be a powerful means of generating extrachromosomal circular DNAs (ecDNAs). 
Looking forward, we anticipate that Genome-Shuffle-seq may be broadly useful for the systematic exploration of the functional consequences of SVs on gene expression, the chromatin landscape, and 3D nuclear architecture. We further anticipate potential uses for in vitro modeling of ecDNAs, as well as in paving the path to a minimal mammalian genome.
ABSTRACT
CRISPR-based gene activation (CRISPRa) is a strategy for upregulating gene expression by targeting promoters or enhancers in a tissue- or cell-type-specific manner. Here, we describe an experimental framework that combines highly multiplexed perturbations with single-cell RNA sequencing (scRNA-seq) to identify cell-type-specific, CRISPRa-responsive cis-regulatory elements and the gene(s) they regulate. Random combinations of many gRNAs are introduced to each of many cells, which are then profiled and partitioned into test and control groups to test for effect(s) of CRISPRa perturbations of both enhancers and promoters on the expression of neighboring genes. Applying this method to a library of 493 gRNAs targeting candidate cis-regulatory elements in both K562 cells and iPSC-derived excitatory neurons, we identify gRNAs capable of specifically upregulating intended target genes and no other neighboring genes within 1 Mb, including gRNAs yielding upregulation of six autism spectrum disorder (ASD) and neurodevelopmental disorder (NDD) risk genes in neurons. A consistent pattern is that the responsiveness of individual enhancers to CRISPRa is restricted by cell type, implying a dependency on the chromatin landscape and/or additional trans-acting factors for successful gene activation. The approach outlined here may facilitate large-scale screens for gRNAs that activate genes in a cell-type-specific manner.
Subjects
CRISPR-Cas Systems , Enhancer Elements, Genetic , Single-Cell Analysis , Humans , Single-Cell Analysis/methods , K562 Cells , Enhancer Elements, Genetic/genetics , Promoter Regions, Genetic/genetics , RNA, Guide, CRISPR-Cas Systems/genetics , Autism Spectrum Disorder/genetics , Neurons/metabolism , Induced Pluripotent Stem Cells/metabolism , Induced Pluripotent Stem Cells/cytology , Clustered Regularly Interspaced Short Palindromic Repeats/genetics
ABSTRACT
Many biological networks have to filter out useful information from a vast excess of spurious interactions. In this Letter, we use computational evolution to predict design features of networks processing ligand categorization. The important problem of early immune response is considered as a case study. Rounds of evolution with different constraints uncover elaborations of the same network motif we name "adaptive sorting." Corresponding network substructures can be identified in current models of immune recognition. Our work draws a deep analogy between immune recognition and biochemical adaptation.
Subjects
Algorithms , Antigen-Presenting Cells/immunology , Models, Biological , Models, Immunological , T-Lymphocytes/immunology , Computer Simulation , Ligands , Major Histocompatibility Complex/immunology , Receptors, Antigen, T-Cell/immunology
ABSTRACT
The ability to profile transcriptomes and characterize global gene expression changes has been greatly enabled by the development of RNA sequencing technologies (RNA-seq). However, the process of generating sequencing-compatible cDNA libraries from RNA samples can be time-consuming and expensive, especially for bacterial mRNAs which lack poly(A)-tails that are often used to streamline this process for eukaryotic samples. Compared to the increasing throughput and decreasing cost of sequencing, library preparation has had limited advances. Here, we describe bacterial-multiplexed-seq (BaM-seq), an approach that enables simple barcoding of many bacterial RNA samples that decreases the time and cost of library preparation. We also present targeted-bacterial-multiplexed-seq (TBaM-seq) that allows for differential expression analysis of specific gene panels with over 100-fold enrichment in read coverage. In addition, we introduce the concept of transcriptome redistribution based on TBaM-seq that dramatically reduces the required sequencing depth while still allowing for quantification of both highly and lowly abundant transcripts. These methods accurately measure gene expression changes with high technical reproducibility and agreement with gold standard, lower throughput approaches. Together, use of these library preparation protocols allows for fast, affordable generation of sequencing libraries.
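As a rough illustration of what "fold enrichment in read coverage" means for a targeted gene panel, the sketch below compares the fraction of reads mapping to the panel in a targeted library with the same fraction in an untargeted library. This is one plausible way to define the quantity and is not necessarily the calculation used in the paper. The function name and all read counts are hypothetical.

```python
def fold_enrichment(
    panel_reads_targeted: int,
    total_reads_targeted: int,
    panel_reads_untargeted: int,
    total_reads_untargeted: int,
) -> float:
    """Ratio of on-panel read fractions, targeted vs. untargeted library."""
    counts = (
        panel_reads_targeted,
        total_reads_targeted,
        panel_reads_untargeted,
        total_reads_untargeted,
    )
    if min(counts) <= 0:
        raise ValueError("all read counts must be positive")
    frac_targeted = panel_reads_targeted / total_reads_targeted
    frac_untargeted = panel_reads_untargeted / total_reads_untargeted
    return frac_targeted / frac_untargeted


if __name__ == "__main__":
    # Hypothetical counts: half of the targeted reads hit the panel,
    # versus 0.4% of reads in the untargeted library.
    value = fold_enrichment(500_000, 1_000_000, 4_000, 1_000_000)
    print(f"fold enrichment: {value:.1f}")
```

Normalizing by total reads in each library keeps the comparison independent of sequencing depth, which matters when the targeted library is sequenced more shallowly.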
ABSTRACT
Prime editing is a powerful means of introducing precise changes to specific locations in mammalian genomes. However, the widely varying efficiency of prime editing across target sites of interest has limited its adoption in the context of both basic research and clinical settings. Here, we set out to exhaustively characterize the impact of the cis-chromatin environment on prime editing efficiency. Using a newly developed and highly sensitive method for mapping the genomic locations of a randomly integrated "sensor", we identify specific epigenetic features that strongly correlate with the highly variable efficiency of prime editing across different genomic locations. Next, to assess the interaction of trans-acting factors with the cis-chromatin environment, we develop and apply a pooled genetic screening approach with which the impact of knocking down various DNA repair factors on prime editing efficiency can be stratified by cis-chromatin context. Finally, we demonstrate that we can dramatically modulate the efficiency of prime editing through epigenome editing, i.e., altering chromatin state in a locus-specific manner in order to increase or decrease the efficiency of prime editing at a target site. Looking forward, we envision that the insights and tools described here will broaden the range of both basic research and therapeutic contexts in which prime editing is useful.
ABSTRACT
Enzymatic pathways have evolved uniquely preferred protein expression stoichiometry in living cells, but our ability to predict the optimal abundances from basic properties remains underdeveloped. Here, we report a biophysical, first-principles model of growth optimization for core mRNA translation, a multi-enzyme system that involves proteins with a broadly conserved stoichiometry spanning two orders of magnitude. We show that predictions from maximization of ribosome usage in a parsimonious flux model constrained by proteome allocation agree with the conserved ratios of translation factors. The analytical solutions, without free parameters, provide an interpretable framework for the observed hierarchy of expression levels based on simple biophysical properties, such as diffusion constants and protein sizes. Our results provide an intuitive and quantitative understanding for the construction of a central process of life, as well as a path toward rational design of pathway-specific enzyme expression stoichiometry.
Subjects
Bacteria/enzymology , Enzymes/chemistry , Protein Biosynthesis , Bacteria/genetics , Bacteria/metabolism , Gene Expression Regulation, Bacterial , Models, Theoretical , Proteome/metabolism , Ribosomes/physiology
ABSTRACT
Aminoacyl-tRNA synthetases (aaRSs) serve a dual role in charging tRNAs. Their enzymatic activities both provide protein synthesis flux and reduce uncharged tRNA levels. Although uncharged tRNAs can negatively impact bacterial growth, substantial concentrations of tRNAs remain deacylated even under nutrient-rich conditions. Here, we show that tRNA charging in Bacillus subtilis is not maximized due to optimization of aaRS production during rapid growth, which prioritizes demands in protein synthesis over charging levels. The presence of uncharged tRNAs is alleviated by precisely tuned translation kinetics and the stringent response, both insensitive to aaRS overproduction but sharply responsive to underproduction, allowing for just enough aaRS production atop a "fitness cliff." Notably, we find that the stringent response mitigates fitness defects at all aaRS underproduction levels even without external starvation. Thus, adherence to minimal, flux-satisfying protein production drives limited tRNA charging and provides a basis for the sensitivity and setpoints of an integrated growth-control network.
Subjects
Amino Acyl-tRNA Synthetases/genetics , RNA, Transfer/genetics , Humans
ABSTRACT
RNA polymerases (RNAPs) transcribe genes through a cycle of recruitment to promoter DNA, initiation, elongation, and termination. After termination, RNAP is thought to initiate the next round of transcription by detaching from DNA and rebinding a new promoter. Here we use single-molecule fluorescence microscopy to observe individual RNAP molecules after transcript release at a terminator. Following termination, RNAP almost always remains bound to DNA and sometimes exhibits one-dimensional sliding over thousands of basepairs. Unexpectedly, the DNA-bound RNAP often restarts transcription, usually in reverse direction, thus producing an antisense transcript. Furthermore, we report evidence of this secondary initiation in live cells, using genome-wide RNA sequencing. These findings reveal an alternative transcription cycle that allows RNAP to reinitiate without dissociating from DNA, which is likely to have important implications for gene regulation.