ABSTRACT
We demonstrate a transcriptional regulatory design algorithm that can boost expression in yeast and mammalian cell lines. The system consists of a simplified transcriptional architecture composed of a minimal core promoter and a synthetic upstream regulatory region (sURS) composed of up to three motifs selected from a list of 41 motifs conserved in the eukaryotic lineage. The sURS system was first characterized using an oligo-library containing 189,990 variants. We validate the resultant expression model using a set of 43 unseen sURS designs. The validation sURS experiments indicate that a generic set of grammar rules for boosting and attenuation may exist in yeast cells. Finally, we demonstrate that this generic set of grammar rules functions similarly in mammalian CHO-K1 and HeLa cells. Consequently, our work provides a design algorithm for boosting the expression of promoters used for expressing industrially relevant proteins in yeast and mammalian cell lines.
Subject(s)
Eukaryotic Cells , Saccharomyces cerevisiae , Animals , Humans , Saccharomyces cerevisiae/genetics , HeLa Cells , Promoter Regions, Genetic/genetics , Gene Expression , Mammals/genetics
ABSTRACT
Liquid-solid transition, also known as gelation, is a specific form of phase separation in which molecules cross-link to form a highly interconnected compartment with solid-like dynamical properties. Here, we utilize RNA hairpin coat-protein binding sites to form synthetic RNA-based gel-like granules via liquid-solid phase transition. We show both in vitro and in vivo that hairpin-containing synthetic long non-coding RNA (slncRNA) molecules granulate into bright localized puncta. We further demonstrate that upon introduction of the coat-proteins, less-condensed gel-like granules form with the RNA creating an outer shell with the proteins mostly present inside the granule. Moreover, by tracking puncta fluorescence signals over time, we detected addition or shedding events of slncRNA-CP nucleoprotein complexes. Consequently, our granules constitute a genetically encoded storage compartment for protein and RNA with a programmable controlled release profile that is determined by the number of hairpins encoded into the RNA. Our findings have important implications for the potential regulatory role of naturally occurring granules and for the broader biotechnology field.
Subject(s)
Bacteriophages , RNA , RNA/metabolism , Bacteriophages/metabolism , Proteins/metabolism , Cytoplasmic Granules/metabolism
ABSTRACT
Understanding the grammar of enhancers and how they regulate gene expression is key for both basic research and for the pharma and biotech industries. The design and characterization of synthetic enhancers can expand the known regulatory space. This is achieved by the utilization of DNA Oligo Libraries (OLs), which facilitates screening of as many as millions of synthetic enhancer variants simultaneously. This review includes the latest commercial DNA OL synthesis technology and its capabilities, and a general 'know-how' guide for the design, construction, and analysis of OL-based synthetic enhancer characterization experiments. Specifically, we focus on synthetic-enhancer-based massively parallel reporter assay, Sort-seq methodologies (e.g. flow cytometry, deep sequencing), and a brief description of machine learning-based attempts for OL-analysis and follow-up validation experiments.
Subject(s)
DNA , Enhancer Elements, Genetic , DNA/genetics , Enhancer Elements, Genetic/genetics
ABSTRACT
We present a cell-free assay for rapid screening of candidate inhibitors of protein binding, focusing on inhibition of the interaction between the SARS-CoV-2 Spike receptor binding domain (RBD) and human angiotensin-converting enzyme 2 (hACE2). The assay has two components: fluorescent polystyrene particles covalently coated with RBD, termed virion-particles (v-particles), and fluorescently labeled hACE2 (hACE2F) that binds the v-particles. When incubated with an inhibitor, v-particle-hACE2F binding is diminished, resulting in a reduction in the fluorescent signal of bound hACE2F relative to the noninhibitor control, which can be measured via flow cytometry or fluorescence microscopy. We determine the amount of RBD needed for v-particle preparation, v-particle incubation time with hACE2F, hACE2F detection limit, and specificity of v-particle binding to hACE2F. We measure the dose response of the v-particles to known inhibitors. Finally, utilizing an RNA-binding protein tdPP7 incorporated into hACE2F, we demonstrate that RNA-hACE2F granules trap v-particles effectively, providing a basis for potential RNA-hACE2F therapeutics.
Subject(s)
Angiotensin-Converting Enzyme 2 , Antiviral Agents , SARS-CoV-2 , Spike Glycoprotein, Coronavirus , Angiotensin-Converting Enzyme 2/antagonists & inhibitors , Antiviral Agents/pharmacology , COVID-19 , Humans , Protein Binding , RNA/metabolism , SARS-CoV-2/drug effects , Spike Glycoprotein, Coronavirus/antagonists & inhibitors
ABSTRACT
We present Triplex-seq, a deep-sequencing method that systematically maps the interaction space between an oligo library of ssDNA triplex-forming oligos (TFOs) and a particular dsDNA triplex target site (TTS). We demonstrate the method using a randomized oligo library comprising 67 million variants, with five TTSs that differ in guanine (G) content, at two different buffer conditions, denoted pH 5 and pH 7. Our results show that G-rich triplexes form at both pH 5 and pH 7, with the pH 5 set being more stable, indicating that there is a subset of TFOs that form triplexes only at pH 5. In addition, using information analysis, we identify triplex-forming motifs (TFMs), which correspond to minimal functional TFO sequences. We demonstrate, in single-variant verification experiments, that TFOs with these TFMs indeed form a triplex with G-rich TTSs, and that a single mutation in the TFM can weaken binding. Our results show that deep-sequencing platforms can substantially expand our understanding of triplex binding rules and aid in refining the DNA triplex code.
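The abstract does not specify the form of its information analysis. As a minimal sketch, assuming a standard per-position information content (2 bits minus the Shannon entropy of the base distribution) computed over equal-length enriched sequences, a conserved motif would appear as a run of high-information positions. The function name `information_content` and the example sequences are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def information_content(seqs):
    """Per-position information (bits) for equal-length DNA sequences.

    Assumed measure: 2 bits minus the Shannon entropy of the observed bases.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must have equal length")
    bits = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        entropy = -sum((c / len(seqs)) * math.log2(c / len(seqs))
                       for c in counts.values())
        bits.append(2.0 - entropy)
    return bits

# Made-up sequences: positions 0, 1, and 3 are conserved, position 2 is random.
example = ["GGAT", "GGCT", "GGGT", "GGTT"]
print(information_content(example))
```

Positions that score near 2 bits are candidates for a conserved motif; positions near 0 bits carry no sequence preference.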
Subject(s)
DNA/chemistry , Gene Library , High-Throughput Nucleotide Sequencing , Oligodeoxyribonucleotides/chemistry , DNA/genetics , Hydrogen-Ion Concentration , Oligodeoxyribonucleotides/genetics
ABSTRACT
We apply an oligo-library and machine-learning approach to characterize the sequence and structural determinants of binding of the phage coat proteins (CPs) of bacteriophages MS2 (MCP), PP7 (PCP), and Qβ (QCP) to RNA. Using the oligo library, we generate thousands of candidate binding sites for each CP, and screen for binding using a high-throughput dose-response Sort-seq assay (iSort-seq). We then apply a neural network to expand this space of binding sites, which allowed us to identify the critical structural and sequence features for binding of each CP. To verify our model and experimental findings, we design several non-repetitive binding site cassettes and validate their functionality in mammalian cells. We find that the binding of each CP to RNA is characterized by a unique space of sequence and structural determinants, thus providing a more complete description of CP-RNA interaction as compared with previous low-throughput findings. Finally, based on the binding spaces we demonstrate a computational tool for the successful design and rapid synthesis of functional non-repetitive binding-site cassettes.
Subject(s)
Allolevivirus/genetics , Capsid Proteins/metabolism , Escherichia coli/virology , Levivirus/genetics , RNA/metabolism , Attachment Sites, Microbiological/genetics , Binding Sites/genetics , Cell Line, Tumor , Escherichia coli/genetics , Gene Library , Humans , Machine Learning , Plasmids/genetics
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
The density and long-term stability of DNA make it an appealing storage medium, particularly for long-term data archiving. Existing DNA storage technologies involve the synthesis and sequencing of multiple nominally identical molecules in parallel, resulting in information redundancy. We report the development of encoding and decoding methods that exploit this redundancy using composite DNA letters. A composite DNA letter is a representation of a position in a sequence that consists of a mixture of all four DNA nucleotides in a predetermined ratio. Our methods encode data using fewer synthesis cycles. We encode 6.4 MB into composite DNA, with distinguishable composition medians, using 20% fewer synthesis cycles per unit of data, as compared to previous reports. We also simulate encoding with larger composite alphabets, with distinguishable composition deciles, to show that 75% fewer synthesis cycles are potentially sufficient. We describe applicable error-correcting codes and inference methods, and investigate error patterns in the context of composite DNA letters.
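To make the notion of a composite letter concrete, the following is a minimal sketch and not the authors' decoder. It assumes a small hypothetical alphabet in which each letter is a fixed mixing ratio over (A, C, G, T), and it assigns the bases observed at one position across many reads to the letter whose ratio is closest in squared Euclidean distance. The names `ALPHABET` and `infer_letter`, the letters M and K, and the read counts are illustrative assumptions.

```python
from collections import Counter

# Hypothetical composite alphabet: each letter is a mixing ratio over (A, C, G, T).
ALPHABET = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "C": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "T": (0.0, 0.0, 0.0, 1.0),
    "M": (0.5, 0.5, 0.0, 0.0),  # equal mixture of A and C
    "K": (0.0, 0.0, 0.5, 0.5),  # equal mixture of G and T
}

def infer_letter(bases):
    """Return the composite letter whose ratio is closest to the observed base frequencies."""
    counts = Counter(bases)
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("no valid bases observed")
    freqs = [counts[b] / total for b in "ACGT"]

    def squared_distance(letter):
        return sum((f - r) ** 2 for f, r in zip(freqs, ALPHABET[letter]))

    return min(ALPHABET, key=squared_distance)

# Bases seen at one position across reads of nominally identical molecules (made-up counts).
column = "A" * 48 + "C" * 52
print(infer_letter(column))
```

The redundancy of many reads per position is what allows the mixing ratio, rather than a single base, to be read back; a larger alphabet needs more reads per position to keep neighboring ratios distinguishable.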
Subject(s)
DNA/chemical synthesis , Algorithms , Base Sequence , High-Throughput Nucleotide Sequencing/methods , Humans , Information Storage and Retrieval , Sequence Analysis, DNA/methods
ABSTRACT
In the initiation step of protein translation, the ribosome binds to the initiation region of the mRNA. Translation initiation can be blocked by binding of an RNA binding protein (RBP) to the initiation region of the mRNA, which interferes with ribosome binding. In the presented method, we utilize this blocking phenomenon to quantify the binding affinity of RBPs to their cognate and non-cognate binding sites. To do this, we insert a test binding site in the initiation region of a reporter mRNA and induce the expression of the test RBP. In the case of RBP-RNA binding, we observed a sigmoidal repression of the reporter expression as a function of RBP concentration. In the case of no affinity or very low affinity between binding site and RBP, no significant repression was observed. The method is carried out in live bacterial cells, and does not require expensive or sophisticated machinery. It is useful for quantifying and comparing the binding affinities of different RBPs that are functional in bacteria to a set of designed binding sites. This method may be inappropriate for binding sites with high structural complexity. This is due to the possibility of repression of ribosomal initiation by complex mRNA structure in the absence of RBP, which would result in lower basal reporter gene expression, and thus less-observable reporter repression upon RBP binding.
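The sigmoidal dose response described above can be illustrated with a Hill-type repression curve. This is a sketch under stated assumptions, not the authors' analysis: the model form `basal / (1 + (rbp / k_rbp) ** n)`, the function names, the arbitrary units, and the grid-search fit are all assumptions, and the data are synthetic values generated from the model itself.

```python
def repression(rbp, basal, k_rbp, n=1.0):
    """Assumed Hill-type repression: reporter level at RBP concentration `rbp`."""
    return basal / (1.0 + (rbp / k_rbp) ** n)

def fit_k(rbp_levels, reporter_levels, basal, n=1.0):
    """Grid-search the half-repression constant that minimizes squared error."""
    # Candidate constants, log-spaced from 0.01 to 1000 (arbitrary units).
    candidates = [10 ** (e / 20.0) for e in range(-40, 61)]

    def sse(k):
        return sum((repression(r, basal, k, n) - y) ** 2
                   for r, y in zip(rbp_levels, reporter_levels))

    return min(candidates, key=sse)

# Synthetic data (not measurements), generated with k_rbp = 10 in arbitrary units.
rbp = [0.1, 1, 3, 10, 30, 100, 300]
reporter = [repression(r, basal=1000.0, k_rbp=10.0) for r in rbp]
print(round(fit_k(rbp, reporter, basal=1000.0), 2))
```

A binding site with no affinity for the RBP would give a flat curve, in which case the fitted constant drifts to the upper edge of the search grid and should not be read as an affinity.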
Subject(s)
Amino Acid Sequence/genetics , Bacteria/genetics , Base Sequence/genetics , RNA-Binding Proteins/metabolism
ABSTRACT
The construction of complex gene-regulatory networks requires both inhibitory and upregulatory modules. However, the vast majority of RNA-based regulatory "parts" are inhibitory. Using a synthetic biology approach combined with SHAPE-seq, we explored the regulatory effect of RNA-binding protein (RBP)-RNA interactions in bacterial 5' UTRs. By positioning a library of RNA hairpins upstream of a reporter gene and co-expressing them with the matching RBP, we observed a set of regulatory responses, including translational stimulation, translational repression, and cooperative behavior. Our combined approach revealed three distinct states in vivo: in the absence of RBPs, the RNA molecules can be found in either a molten state that is amenable to translation or a structured phase that inhibits translation. In the presence of RBPs, the RNA molecules are in a semi-structured phase with partial translational capacity. Our work provides new insight into RBP-based regulation and a blueprint for designing complete gene-regulatory circuits at the post-transcriptional level.
Subject(s)
5' Untranslated Regions/genetics , Models, Biological , RNA-Binding Proteins/metabolism , RNA/genetics , Animals , Down-Regulation , Gene Regulatory Networks , Humans , Models, Theoretical , Molecular Conformation , Protein Binding , RNA Processing, Post-Transcriptional , RNA-Binding Proteins/genetics , Structure-Activity Relationship , Synthetic Biology , Up-Regulation
ABSTRACT
Whole cell bioreporters, such as bacterial cells, can be used for environmental and clinical sensing of specific analytes. However, the current methods implemented to observe such bioreporters in the form of chemotactic responses heavily rely on microscope analysis, fluorescent labels, and hard-to-scale microfluidic devices. Herein, we demonstrate that chemotaxis can be detected within minutes using intrinsic optical measurements of silicon femtoliter well arrays (FMAs). This is done via phase-shift reflectometric interference spectroscopic measurements (PRISM) of the wells, which act as silicon diffraction gratings, enabling label-free, real-time quantification of the number of trapped bacterial cells in the optical readout. By generating unsteady chemical gradients over the wells, we first demonstrate that chemotaxis toward attractants and away from repellents can be easily differentiated based on the signal response of PRISM. The lowest concentration of chemorepellent to elicit an observed bacterial response was 50 mM, whereas the lowest concentration of chemoattractant to elicit a response was 10 mM. Second, we employed PRISM, in combination with a computational approach, to rapidly scan for and identify a novel synthetic histamine chemoreceptor strain. Consequently, we show that by using a combined computational design approach, together with a quantitative, real-time, and label-free detection method, it is possible to manufacture and characterize novel synthetic chemoreceptors in Escherichia coli (E. coli).
ABSTRACT
We study translation repression in bacteria by engineering a regulatory circuit that functions as a binding assay for RNA binding proteins (RBP) in vivo. We do so by inducing expression of a fluorescent protein-RBP chimera, together with encoding its binding site at various positions within the ribosomal initiation region (+11-13 nt from the AUG) of a reporter module. We show that when bound by their cognate RBPs, the phage coat proteins for PP7 (PCP) and Qβ (QCP), strong repression is observed for all hairpin positions within the initiation region. Yet, a sharp transition to no-effect is observed when positioned in the elongation region, at a single-nucleotide resolution. Employing in vivo Selective 2'-hydroxyl acylation analyzed by primer extension followed by sequencing (SHAPE-seq) for a representative construct, we established that in the translationally active state the mRNA molecule is nonstructured, while in the repressed state a structured signature was detected. We then utilize this regulatory phenomenon to quantify the binding affinity of the coat proteins of phages MS2, PP7, GA, and Qβ to 14 cognate and noncognate binding sites in vivo. Using our circuit, we demonstrate qualitative differences between in vitro and in vivo binding characteristics for various variants when compared with past studies. Furthermore, by introducing a simple mutation to the loop region for the Qβ-wt site, MCP binding is abolished, creating the first high-affinity QCP site that is completely orthogonal to MCP. Consequently, we demonstrate that our hybrid transcriptional-post-transcriptional circuit can be utilized as a binding assay to quantify RNA-RBP interactions in vivo.
Subject(s)
Genes, Reporter , RNA-Binding Proteins/metabolism , Bacteria/metabolism , Bacteriophages/metabolism , Binding Sites , Biological Assay , Capsid Proteins/genetics , Capsid Proteins/metabolism , Inverted Repeat Sequences , Nucleic Acid Conformation , Plasmids/genetics , Plasmids/metabolism , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics
ABSTRACT
We use an oligonucleotide library of >10,000 variants to identify an insulation mechanism encoded within a subset of σ54 promoters. Insulation manifests itself as reduced protein expression for a downstream gene that is expressed by transcriptional readthrough. It is strongly associated with the presence of short CT-rich motifs (3-5 bp), positioned within 25 bp upstream of the Shine-Dalgarno (SD) motif of the silenced gene. We provide evidence that insulation is triggered by binding of the ribosome binding site (RBS) to the upstream CT-rich motif. We also show that, in E. coli, insulator sequences are preferentially encoded within σ54 promoters, suggesting an important regulatory role for these sequences in natural contexts. Our findings imply that sequence-specific regulatory effects that are sparsely encoded by short motifs may not be easily detected by lower throughput studies. Such sequence-specific phenomena can be uncovered with a focused oligo library (OL) design that mitigates sequence-related variance, as exemplified herein.
Subject(s)
Escherichia coli/genetics , Gene Library , Insulator Elements/genetics , Promoter Regions, Genetic , Sequence Analysis, DNA , Sigma Factor/genetics , Binding Sites/genetics , Down-Regulation/genetics , Gene Expression Regulation, Bacterial , Gene Silencing , Genome, Bacterial , Mutation/genetics , Nucleotide Motifs/genetics , Ribosomes/metabolism
ABSTRACT
We model the regulatory role of proteins bound to looped DNA using a simulation in which dsDNA is represented as a self-avoiding chain, and proteins as spherical protrusions. We simulate long self-avoiding chains using a sequential importance sampling Monte-Carlo algorithm, and compute the probabilities for chain looping with and without a protrusion. We find that a protrusion near one of the chain's termini reduces the probability of looping, even for chains much longer than the protrusion-chain-terminus distance. This effect increases with protrusion size, and decreases with protrusion-terminus distance. The reduced probability of looping can be explained via an eclipse-like model, which provides a novel inhibitory mechanism. We test the eclipse model on two possible transcription-factor occupancy states of the D. melanogaster eve 3/7 enhancer, and show that it provides a possible explanation for the experimentally-observed eve stripe 3 and 7 expression patterns.
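Sequential importance sampling of self-avoiding chains can be sketched with the classic Rosenbluth scheme. The version below is a deliberately simplified illustration on a cubic lattice and is not the authors' simulation, which models dsDNA as a self-avoiding chain with spherical protrusions. The function names, the lattice, and the looping criterion (chain end adjacent to the origin) are all assumptions made for the example.

```python
import random

# Unit steps on a cubic lattice (a simplification; hypothetical stand-in for a chain link).
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def grow_chain(n_steps, rng):
    """Grow one self-avoiding walk; return (positions, weight). Weight 0.0 marks a dead end."""
    pos = (0, 0, 0)
    visited = {pos}
    chain = [pos]
    weight = 1.0
    for _ in range(n_steps):
        free = [(pos[0] + dx, pos[1] + dy, pos[2] + dz) for dx, dy, dz in STEPS]
        free = [p for p in free if p not in visited]
        if not free:
            return chain, 0.0
        weight *= len(free)  # importance weight corrects for the biased growth
        pos = rng.choice(free)
        visited.add(pos)
        chain.append(pos)
    return chain, weight

def looping_probability(n_steps, n_samples, seed=0):
    """Weighted fraction of chains whose end lies one lattice step from the origin."""
    rng = random.Random(seed)
    total = 0.0
    looped = 0.0
    for _ in range(n_samples):
        chain, w = grow_chain(n_steps, rng)
        total += w
        if w > 0 and sum(abs(c) for c in chain[-1]) == 1:
            looped += w
    return looped / total

print(looping_probability(n_steps=3, n_samples=20000))
```

In this toy setting, a bound protein could be represented by adding its occupied sites to `visited` before growth starts, so that the chain must avoid them; comparing the looping estimate with and without those sites mirrors the comparison described in the abstract.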
Subject(s)
DNA , Enhancer Elements, Genetic/genetics , Models, Genetic , Monte Carlo Method , Animals , Computational Biology , DNA/chemistry , DNA/metabolism , Drosophila melanogaster/genetics , Transcription Factors/genetics
ABSTRACT
We explore a model for 'quenching-like' repression by studying synthetic bacterial enhancers, each characterized by a different binding site architecture. To do so, we take a three-pronged approach: first, we compute the probability that a protein-bound dsDNA molecule will loop. Second, we use hundreds of synthetic enhancers to test the model's predictions in bacteria. Finally, we verify the mechanism bioinformatically in native genomes. Here we show that excluded volume effects generated by DNA-bound proteins can generate substantial quenching. Moreover, the type and extent of the regulatory effect depend strongly on the relative arrangement of the binding sites. The implications of these results are that enhancers should be insensitive to 10-11 bp insertions or deletions (INDELs) and sensitive to 5-6 bp INDELs. We test this prediction on 61 σ(54)-regulated qrr genes from the Vibrio genus and confirm the tolerance of these enhancers' sequences to the DNA's helical repeat.
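The contrast between 10-11 bp and 5-6 bp INDELs follows from the helical phase of DNA. As a rough sketch, assuming a B-DNA helical repeat of about 10.5 bp per turn (a commonly quoted value that is not stated in the abstract), the angular offset that an insertion or deletion introduces between two binding sites can be computed as follows. The constant and function names are hypothetical.

```python
HELICAL_REPEAT_BP = 10.5  # assumed B-DNA helical repeat, in bp per turn

def phase_offset_deg(indel_bp, repeat=HELICAL_REPEAT_BP):
    """Angular offset (0 to 180 degrees) between two sites after an indel of `indel_bp`."""
    angle = (360.0 * abs(indel_bp) / repeat) % 360.0
    return min(angle, 360.0 - angle)

for n in (5, 6, 10, 11):
    print(n, round(phase_offset_deg(n), 1))
```

Under this assumption, a 5-6 bp change places a bound protein on nearly the opposite face of the helix, whereas a 10-11 bp change leaves it close to its original face, which is consistent with the predicted sensitivity pattern.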
Subject(s)
DNA, Bacterial/chemistry , Enhancer Elements, Genetic , Synthetic Biology/methods , Bacteria/genetics , Gene Expression Regulation, Bacterial/physiology , Models, Biological , Promoter Regions, Genetic , Quorum Sensing
ABSTRACT
We compute the effects of excluded volume on the probability for double-stranded DNA to form a loop. We utilize a Monte Carlo algorithm for generation of large ensembles of self-avoiding wormlike chains, which are used to compute the J factor for varying length scales. In the entropic regime, we confirm the scaling-theory prediction of a power-law drop off of -1.92, which is significantly stronger than the -1.5 power law predicted by the non-self-avoiding wormlike chain model. In the elastic regime, we find that the angle-independent end-to-end chain distribution is highly anisotropic. This anisotropy, combined with the excluded volume constraints, leads to an increase in the J factor of the self-avoiding wormlike chain by about half an order of magnitude relative to its non-self-avoiding counterpart. This increase could partially explain the anomalous results of recent cyclization experiments, in which short dsDNA molecules were found to have an increased propensity to form a loop.
ABSTRACT
One of the greatest challenges facing synthetic biology is to develop a technology that allows gene regulatory circuits in microbes to integrate multiple inputs or stimuli using a small DNA sequence "footprint", and which will generate precise and reproducible outcomes. Achieving this goal is hindered by the routine utilization of the commonplace σ(70) promoters in gene-regulatory circuits. These promoters typically are not capable of integrating binding of more than two or three transcription factors in natural examples, which has limited the field to developing integrated circuits made of two-input biological "logic" gates. In nature, the regulatory elements that integrate multiple inputs are called enhancers. These regulatory elements are ubiquitous in all organisms in the tree of life, and interestingly metazoan and bacterial enhancers are significantly more similar in terms of both transcription factor binding site arrangement and biological function than previously thought. These similarities imply that there may be underlying enhancer design principles or grammar rules by which one can engineer novel gene regulatory circuits. However, our current understanding of the enhancer structure-function relationship in all organisms is limited, which prevents us from using these objects routinely in synthetic biology applications. In order to alleviate this problem, in this book chapter, I will review our current view of bacterial enhancers, allowing us to first highlight the potential of enhancers to be a game-changing tool in synthetic biology applications, and subsequently to draw a road-map for developing the necessary quantitative understanding to reach this goal.
Subject(s)
Bacteria/genetics , Enhancer Elements, Genetic , Gene Regulatory Networks/genetics , Genes, Synthetic , Animals , Computational Biology/methods , Computational Biology/trends , Gene Expression Regulation, Bacterial , Humans , Synthetic Biology/trends
ABSTRACT
Protein-protein interactions play an important role in determining the regulatory output of cis regulatory regions. In this work, we revisit the regulatory output functions recorded for the synthetic enhancers that contain binding sites for TetR. We use our thermodynamic model as an analysis tool to infer that two different types of interactions may take place between the TetR molecules. First, a strong mutually exclusive anti-cooperative interaction precludes the synthetic enhancer from being occupied by more than one AT (the aTc bound TetR isoform) protein, and a second weak cooperative interaction exists between the aTc-free TetR isoform (T). Consequently, this work highlights the power of the synthetic enhancer approach as a tool for studying protein-protein interactions via an experimentally verifiable prediction for the general mode of binding of the TetR repressor.
Subject(s)
Computer Simulation , Enhancer Elements, Genetic , Models, Genetic , Algorithms , DNA, Bacterial/genetics , Escherichia coli Proteins/genetics , Gene Expression Regulation, Bacterial , Protein Binding , Protein Isoforms/genetics , Synthetic Biology , Tetracycline Resistance/genetics , Thermodynamics
ABSTRACT
A challenge of the synthetic biology approach is to use our understanding of a system to recreate a biological function with specific properties. We have applied this framework to bacterial enhancers, combining a driver, transcription factor binding sites, and a poised polymerase to create synthetic modular enhancers. Our findings suggest that enhancer-based transcriptional control depends critically and quantitatively on DNA looping, leading to complex regulatory effects when the enhancer cassettes contain additional transcription factor binding sites for TetR, a bacterial transcription factor. We show through a systematic interplay of experiment and thermodynamic modeling that the level of gene expression can be modulated to convert a variable inducer concentration input into discrete or step-like output expression levels. Finally, using a different DNA-binding protein (TraR), we show that the regulatory output is not a particular feature of the specific DNA-binding protein used for the enhancer but a general property of synthetic bacterial enhancers.
Subject(s)
Enhancer Elements, Genetic , Synthetic Biology/methods , Bacteria/genetics , DNA/chemistry , Escherichia coli/genetics , Promoter Regions, Genetic , Transcription, Genetic
ABSTRACT
Holliday junctions form during DNA repair and homologous recombination processes. These processes entail branch migration, whereby the length of two arms of a cruciform increases at the expense of the two others. Branch migration is carried out in prokaryotic cells by the RuvAB motor complex. We study RuvAB-catalyzed branch migration by following the motion of a small paramagnetic bead tethered to a surface by two opposing arms of a single cruciform. The bead, pulled under the action of magnetic tweezers, exerts tension on the cruciform, which in turn transmits the force to a single RuvAB complex bound at the crossover point. This setup provides a unique means of measuring several kinetic parameters of interest such as the translocation rate, the processivity, and the force on the substrate against which the RuvAB complex cannot effect translocation. RuvAB-catalyzed branch migration proceeds with a small, discrete number of rates, supporting the view that the monomers comprising the RuvB hexameric rings are not functionally homogeneous and that dimers or trimers constitute the active subunits. The most frequently encountered rate, 98 ± 3 bp/s, is approximately five times faster than previously estimated. The apparent processivity of branch migration between pauses of inactivity is approximately 7,000 bp. Branch migration persists against opposing forces up to 23 pN.