ABSTRACT
The development of highly productive, genetically stable manufacturing cell lines is on the critical path to IND filing for protein-based biologic drugs. Here, we describe the Leap-In Transposase® platform, a novel transposon-based mammalian (e.g., Chinese hamster ovary) cell line development system that produces high-titer stable pools with productivity and product quality attributes that are highly comparable to clones that are subsequently derived therefrom. The productivity distributions of clones are strongly biased toward high producers, and genetic and expression stability is consistently high. By avoiding the poor integration rates, concatemer formation, detrimental transgene recombination, low average expression level, unpredictable product quality, and inconsistent genetic stability characteristic of nonhomologous recombination methods, Leap-In provides several opportunities to de-risk programs early and reduce timelines and resources.
Subjects
Biological Products/metabolism , Cell Line , DNA Transposable Elements , Transgenes , Transposases , Animals , Bioengineering , CHO Cells , Clone Cells , Cricetulus , Humans , Mammals , Mice , Promoter Regions, Genetic
ABSTRACT
Mammalian cell line stability is an important consideration when establishing a biologics manufacturing process in the biopharmaceutical and in vitro diagnostics (IVD) industries. Traditional Chinese hamster ovary (CHO) cell line development methods use a random integration approach that requires transfection, selection, optional amplification, screening, and single-cell cloning to select clones with acceptable productivity, product quality, and genetic stability. Site-specific integration reduces these disadvantages, and new technologies have been developed to mitigate risks associated with genetic instability. In this study, we applied the Leap-In® transposase-mediated expression system from ATUM to generate stable CHOK1 pools for the production of four recombinant antibody reagents for IVD immunoassays. CHO cell line stability is defined by consistent antibody production over time. Three of the CHOK1 pools maintained productivity suitable for manufacturing, with high antibody yields. The productivity of the remaining CHOK1 pool decreased over time; however, derivative clones showed acceptable stability. L-glutamine had variable effects on CHOK1 cell line or stable pool stability and significantly affected antibody product titer. Compared with traditional random integration methods, the ATUM Leap-In system can reduce the time needed to develop new immunoassays by using semi site-specific integration to generate high-yield stable pools that meet manufacturing stability requirements.
Subjects
Cricetulus , Recombinant Proteins , CHO Cells , Animals , Recombinant Proteins/genetics , Recombinant Proteins/immunology , Recombinant Proteins/biosynthesis , Antibodies, Monoclonal/biosynthesis , Antibodies, Monoclonal/immunology , Antibodies, Monoclonal/genetics , Cricetinae , Humans , Transposases/genetics , Transposases/metabolism
ABSTRACT
The DNA sequence used to encode a polypeptide can have dramatic effects on its expression. Lack of readily available tools has until recently inhibited meaningful experimental investigation of this phenomenon. Advances in synthetic biology and the application of modern engineering approaches now provide the tools for systematic analysis of the sequence variables affecting heterologous expression of recombinant proteins. We here discuss how these new tools are being applied and how they circumvent the constraints of previous approaches, highlighting some of the surprising and promising results emerging from the developing field of gene engineering.
Subjects
Genetic Engineering/methods , Recombinant Proteins/biosynthesis , Recombinant Proteins/genetics , Codon , Gene Library , Genetic Vectors , Humans , Open Reading Frames , Synthetic Biology
ABSTRACT
SCHEMA structure-guided recombination of 3 fungal class II cellobiohydrolases (CBH II cellulases) has yielded a collection of highly thermostable CBH II chimeras. Twenty-three of 48 genes sampled from the 6,561 possible chimeric sequences were secreted by the Saccharomyces cerevisiae heterologous host in catalytically active form. Five of these chimeras have half-lives of thermal inactivation at 63 degrees C that are greater than the most stable parent, CBH II enzyme from the thermophilic fungus Humicola insolens, which suggests that this chimera collection contains hundreds of highly stable cellulases. Twenty-five new sequences were designed based on mathematical modeling of the thermostabilities for the first set of chimeras. Ten of these sequences were expressed in active form; all 10 retained more activity than H. insolens CBH II after incubation at 63 degrees C. The total of 15 validated thermostable CBH II enzymes have high sequence diversity, differing from their closest natural homologs at up to 63 amino acid positions. Selected purified thermostable chimeras hydrolyzed phosphoric acid swollen cellulose at temperatures 7 to 15 degrees C higher than the parent enzymes. These chimeras also hydrolyzed as much or more cellulose than the parent CBH II enzymes in long-time cellulose hydrolysis assays and had pH/activity profiles as broad, or broader than, the parent enzymes. Generating this group of diverse, thermostable fungal CBH II chimeras is the first step in building an inventory of stable cellulases from which optimized enzyme mixtures for biomass conversion can be formulated.
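The library size quoted above is consistent with three parents recombined over eight sequence blocks, since 3^8 = 6,561. The block count is an inference from that arithmetic and is not stated in the abstract. The short Python sketch below, with hypothetical function names, enumerates the combinations under that assumption.

```python
from itertools import product

PARENTS = 3  # from the abstract: three parent CBH II enzymes
BLOCKS = 8   # assumption: inferred only because 3**8 == 6561


def count_chimeras(parents: int, blocks: int) -> int:
    """Number of distinct chimeras when every block can come from any parent."""
    return parents ** blocks


def enumerate_chimeras(parents: int, blocks: int):
    """Yield each chimera as a tuple of 1-based parent indices, one per block."""
    return product(range(1, parents + 1), repeat=blocks)


total = count_chimeras(PARENTS, BLOCKS)
first = next(iter(enumerate_chimeras(PARENTS, BLOCKS)))  # the all-parent-1 sequence
```

Sampling 48 of these sequences, as described above, covers under 1% of the combinatorial space, which is why a predictive model becomes attractive.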
Subjects
Cellulases/genetics , Protein Engineering/methods , Recombination, Genetic , Enzyme Stability , Fungal Proteins/genetics , Hot Temperature , Recombinant Fusion Proteins , Saccharomyces cerevisiae/genetics
ABSTRACT
A quantitative linear model accurately (R² = 0.88) describes the thermostabilities of 54 characterized members of a family of fungal cellobiohydrolase class II (CBH II) cellulase chimeras made by SCHEMA recombination of three fungal enzymes, demonstrating that the contributions of SCHEMA sequence blocks to stability are predominantly additive. Thirty-one of 31 predicted thermostable CBH II chimeras have thermal inactivation temperatures higher than the most thermostable parent CBH II, from Humicola insolens, and the model predicts that hundreds more CBH II chimeras share this superior thermostability. Eight of eight thermostable chimeras assayed hydrolyze the solid cellulosic substrate Avicel at temperatures at least 5 degrees C above the most stable parent, and seven of these showed superior activity in 16-h Avicel hydrolysis assays. The sequence-stability model identified a single block of sequence that adds 8.5 degrees C to chimera thermostability. Mutating individual residues in this block identified the C313S substitution as responsible for the entire thermostabilizing effect. Introducing this mutation into the two recombination parent CBH IIs not featuring it (Hypocrea jecorina and H. insolens) decreased inactivation, increased maximum Avicel hydrolysis temperature, and improved long time hydrolysis performance. This mutation also stabilized and improved Avicel hydrolysis by Phanerochaete chrysosporium CBH II, which is only 55-56% identical to recombination parent CBH IIs. Furthermore, the C313S mutation increased total H. jecorina CBH II activity secreted by the Saccharomyces cerevisiae expression host more than 10-fold. Our results show that SCHEMA structure-guided recombination enables quantitative prediction of cellulase chimera thermostability and efficient identification of stabilizing mutations.
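An additive block model of the kind described above can be sketched as an ordinary least-squares fit over one-hot block indicators. The following Python example is a minimal illustration, not the authors' implementation: the encoding, function names, and all numbers are hypothetical, and the toy data set (3 parents, 2 blocks, noiseless) is invented purely to show the mechanics.

```python
import itertools
import numpy as np


def one_hot(chimera, n_parents):
    """Encode a chimera (tuple of 0-based parent indices, one per block).

    Parent 0 is the reference, so each block adds n_parents - 1 indicator columns.
    """
    row = [1.0]  # intercept: stability of the all-reference chimera
    for parent in chimera:
        row.extend(1.0 if parent == p else 0.0 for p in range(1, n_parents))
    return row


def fit_additive_model(chimeras, stabilities, n_parents):
    """Least-squares estimate of each block's additive contribution."""
    X = np.array([one_hot(c, n_parents) for c in chimeras])
    y = np.array(stabilities, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict(coef, chimera, n_parents):
    """Predicted stability is the intercept plus the sum of block contributions."""
    return float(np.dot(one_hot(chimera, n_parents), coef))


# Toy values, invented for illustration only (not data from the study).
true_coef = np.array([60.0, 1.5, -2.0, 3.0, 0.5])
train = list(itertools.product(range(3), repeat=2))
measured = [float(np.dot(one_hot(c, 3), true_coef)) for c in train]
coef = fit_additive_model(train, measured, 3)
```

Because the model is additive, a large coefficient on a single block points directly at the sequence region worth dissecting by point mutation, which is the logic that led to C313S in the abstract.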
Subjects
Cellulose 1,4-beta-Cellobiosidase/genetics , Fungal Proteins/genetics , Mutation , Recombination, Genetic , Amino Acid Sequence , Ascomycota/enzymology , Binding Sites/genetics , Cellulose/chemistry , Cellulose/metabolism , Cellulose 1,4-beta-Cellobiosidase/chemistry , Cellulose 1,4-beta-Cellobiosidase/metabolism , Computational Biology/methods , Enzyme Stability/genetics , Fungal Proteins/chemistry , Fungal Proteins/metabolism , Hydrogen-Ion Concentration , Hydrolysis , Hypocrea/enzymology , Linear Models , Models, Molecular , Molecular Sequence Data , Protein Structure, Tertiary , Sequence Homology, Amino Acid , Species Specificity , Substrate Specificity , Temperature
ABSTRACT
Omega-hydroxyfatty acids are excellent monomers for synthesizing a unique family of polyethylene-like biobased plastics. However, ω-hydroxyfatty acids are difficult and expensive to prepare by traditional organic synthesis, precluding their use in commodity materials. Here we report the engineering of a strain of the diploid yeast Candida tropicalis to produce commercially viable yields of ω-hydroxyfatty acids. To develop the strain we identified and eliminated 16 genes encoding 6 cytochrome P450s, 4 fatty alcohol oxidases, and 6 alcohol dehydrogenases from the C. tropicalis genome. We also show that fatty acids with different chain lengths and degrees of unsaturation can be more efficiently oxidized by expressing different P450s within this strain background. Engineered C. tropicalis is thus a potentially attractive biocatalytic platform for producing commodity chemicals from renewable resources.
Subjects
Candida tropicalis/metabolism , Fatty Acids/biosynthesis , Genetic Engineering/methods , Oils/metabolism , Plastics/chemistry , Biotransformation , Candida tropicalis/enzymology , Candida tropicalis/genetics , Cytochrome P-450 Enzyme System/deficiency , Cytochrome P-450 Enzyme System/genetics , Cytochrome P-450 Enzyme System/metabolism , Fatty Acids/chemistry , Fermentation , Gene Deletion , Myristic Acid/chemistry , Myristic Acid/metabolism , Myristic Acids/chemistry , Myristic Acids/metabolism , Oxidation-Reduction
ABSTRACT
The type III secretion system (T3SS) exports proteins from the cytoplasm, through both the inner and outer membranes, to the external environment. Here, a system is constructed to harness the T3SS encoded within Salmonella Pathogenicity Island 1 to export proteins of biotechnological interest. The system is composed of an operon containing the target protein fused to an N-terminal secretion tag and its cognate chaperone. Transcription is controlled by a genetic circuit that only turns on when the cell is actively secreting protein. The system is refined using a small human protein (DH domain) and demonstrated by exporting three silk monomers (ADF-1, -2, and -3), representative of different types of spider silk. Synthetic genes encoding silk monomers were designed to enhance genetic stability and codon usage, constructed by automated DNA synthesis, and cloned into the secretion control system. Secretion rates up to 1.8 mg L⁻¹ h⁻¹ are demonstrated with up to 14% of expressed protein secreted. This work introduces new parts to control protein secretion in Gram-negative bacteria, which will be broadly applicable to problems in biotechnology.
Subjects
Fibroins/metabolism , Recombinant Fusion Proteins/metabolism , Salmonella/physiology , Amino Acid Sequence , Animals , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Fibroins/genetics , Humans , Membrane Proteins/genetics , Membrane Proteins/metabolism , Models, Biological , Molecular Sequence Data , Protein Engineering/methods , Protein Transport , Recombinant Fusion Proteins/biosynthesis , Recombinant Fusion Proteins/genetics , Salmonella/genetics , Salmonella/metabolism , Sequence Alignment , Signal Transduction , Spiders/genetics
ABSTRACT
Clonally derived cell lines (CDCLs) from Chinese hamster ovary (CHO) host cell lines remain the most popular method to manufacture therapeutic proteins. However, CHO cell pools are increasingly being used as an alternate method to produce therapeutic proteins for preclinical drug development in an effort to shorten the time required for new drug development. It is essential that these CHO pools exhibit the desired attributes of CHO CDCLs such as high protein titers and consistent product quality attributes (PQAs). In this study the authors evaluated the Leap-In Transposase® for the expression of four different proteins (three mAbs and one bispecific mAb). The resultant pool titers ranged from 2.0 to 5.0 g L⁻¹ for the four proteins compared to 1.5-3.3 g L⁻¹ from the respective control pools (generated by random gene integration). The resultant cell pools are a homogeneously expressing cell population. The average gene copy numbers are similar or lower in the evaluation pools relative to the control pools. The higher titers in the evaluation pools are attributed to higher levels of both IgG-LC and IgG-HC mRNA. In conclusion, the Leap-In transposase generates high-titer, homogeneous CHO pools in a short time period without introducing any undesired PQAs.
Subjects
Antibodies, Bispecific , Antibodies, Monoclonal , Cell Culture Techniques , Transposases , Animals , Antibodies, Bispecific/biosynthesis , Antibodies, Monoclonal/biosynthesis , CHO Cells , Cricetulus , Plasmids
ABSTRACT
BACKGROUND: Altering a protein's function by changing its sequence allows natural proteins to be converted into useful molecular tools. Current protein engineering methods are limited by a lack of high throughput physical or computational tests that can accurately predict protein activity under conditions relevant to its final application. Here we describe a new synthetic biology approach to protein engineering that avoids these limitations by combining high throughput gene synthesis with machine learning-based design algorithms. RESULTS: We selected 24 amino acid substitutions to make in proteinase K from alignments of homologous sequences. We then designed and synthesized 59 specific proteinase K variants containing different combinations of the selected substitutions. The 59 variants were tested for their ability to hydrolyze a tetrapeptide substrate after the enzyme was first heated to 68 degrees C for 5 minutes. Sequence and activity data were analyzed using machine learning algorithms. This analysis was used to design a new set of variants, predicted to have increased activity over the training set, that were then synthesized and tested. By performing two cycles of machine learning analysis and variant design we obtained 20-fold improved proteinase K variants while only testing a total of 95 variant enzymes. CONCLUSION: The number of protein variants that must be tested to obtain significant functional improvements determines the type of tests that can be performed. Protein engineers wishing to modify the property of a protein to shrink tumours or catalyze chemical reactions under industrial conditions have until now been forced to accept high throughput surrogate screens to measure protein properties that they hope will correlate with the functionalities that they intend to modify. By reducing the number of variants that must be tested to fewer than 100, machine learning algorithms make it possible to use more complex and expensive tests so that only protein properties that are directly relevant to the desired application need to be measured. Protein design algorithms that only require the testing of a small number of variants represent a significant step towards a generic, resource-optimized protein engineering process.
Subjects
Artificial Intelligence , Drug Design , Endopeptidase K/chemistry , Endopeptidase K/metabolism , Escherichia coli/metabolism , Mutagenesis, Site-Directed/methods , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Endopeptidase K/genetics , Escherichia coli/genetics , Genes, Synthetic/genetics , Molecular Sequence Data , Mutation , Protein Engineering/methods , Recombinant Proteins/chemistry , Recombinant Proteins/metabolism , Structure-Activity Relationship
ABSTRACT
We describe synthetic shuffling, an evolutionary protein engineering technology in which every amino acid from a set of parents is allowed to recombine independently of every other amino acid. With the use of degenerate oligonucleotides, synthetic shuffling provides a direct route from database sequence information to functional libraries. Physical starting genes are unnecessary, and additional design criteria such as optimal codon usage or known beneficial mutations can also be incorporated. We performed synthetic shuffling of 15 subtilisin genes and obtained active and highly chimeric enzymes with desirable combinations of properties that we did not obtain by other directed-evolution methods.
Subjects
Amino Acids/genetics , Combinatorial Chemistry Techniques/methods , DNA Shuffling/methods , Protein Engineering/methods , Recombinant Proteins/genetics , Amino Acid Sequence , Amino Acids/chemistry , Bacillus subtilis/enzymology , Bacillus subtilis/genetics , Hydrogen-Ion Concentration , Molecular Sequence Data , Peptide Library , Sequence Alignment/methods , Sequence Analysis, Protein/methods
ABSTRACT
BACKGROUND: Direct synthesis of genes is rapidly becoming the most efficient way to make functional genetic constructs and enables applications such as codon optimization, RNAi resistant genes and protein engineering. Here we introduce a software tool that drastically facilitates the design of synthetic genes. RESULTS: Gene Designer is a stand-alone software for fast and easy design of synthetic DNA segments. Users can easily add, edit and combine genetic elements such as promoters, open reading frames and tags through an intuitive drag-and-drop graphic interface and a hierarchical DNA/Protein object map. Using advanced optimization algorithms, open reading frames within the DNA construct can readily be codon optimized for protein expression in any host organism. Gene Designer also includes features such as a real-time sliding calculator of oligonucleotide annealing temperatures, sequencing primer generator, tools for avoidance or inclusion of restriction sites, and options to maximize or minimize sequence identity to a reference. CONCLUSION: Gene Designer is an expandable Synthetic Biology workbench suitable for molecular biologists interested in the de novo creation of genetic constructs.
Subjects
DNA/chemistry , DNA/genetics , Genes, Synthetic/genetics , Genetic Engineering/methods , Sequence Analysis, DNA/methods , Software , Systems Biology/methods , Base Sequence , Computer-Aided Design , Drug Design , Molecular Sequence Data , User-Computer Interface
ABSTRACT
There are two main reasons to try to predict an enzyme's function from its sequence. The first is to identify the components and thus the functional capabilities of an organism; the second is to create enzymes with specific properties. Genomics, expression analysis, proteomics and metabonomics are largely directed towards understanding how information flows from DNA sequence to protein functions within an organism. This review focuses on information flow in the opposite direction: the applicability of what is being learned from natural enzymes to improve methods for catalyst design.
Subjects
Enzymes/chemistry , Protein Engineering , Sequence Analysis, Protein , Animals , Humans , Structural Homology, Protein
ABSTRACT
During protein evolution, amino acids change due to a combination of functional constraints and genetic drift. Proteins frequently contain pairs of amino acids that appear to change together (covariation). Analysis of covariation from naturally occurring sets of orthologs cannot distinguish between residue pairs retained by functional requirements of the protein and those pairs existing due to changes along a common evolutionary path. Here, we have separated the two types of covariation by independently recombining every naturally occurring amino acid variant within a set of 15 subtilisin orthologs. Our analysis shows that in this family of subtilisin orthologs, almost all possible pairwise combinations of amino acids can coexist. This suggests that amino acid covariation found in the subtilisin orthologs is almost entirely due to common ancestral origin of the changes rather than functional constraints. We conclude that naturally occurring sequence diversity can be used to identify positions that can vary independently without destroying protein function.
Subjects
Amino Acid Substitution , Evolution, Molecular , Subtilisins/genetics , Amino Acid Sequence , Bacillus/enzymology , Bacillus/genetics , Binding Sites , Directed Molecular Evolution , Genetic Variation , Models, Molecular , Phylogeny , Protein Conformation , Subtilisins/chemistry , Subtilisins/physiology , Thermodynamics
ABSTRACT
Complex multivariate engineering problems are commonplace and not unique to protein engineering. Mathematical and data-mining tools developed in other fields of engineering have now been applied to analyze sequence-activity relationships of peptides and proteins and to assist in the design of proteins and peptides with specified properties. Decreasing costs of DNA sequencing in conjunction with methods to quickly synthesize statistically representative sets of proteins allow modern heuristic statistics to be applied to protein engineering. This provides an alternative approach to expensive assays or unreliable high-throughput surrogate screens.
Subjects
Computational Biology/methods , Enzymes/chemistry , Protein Engineering/trends , Algorithms , Amino Acid Sequence , Amino Acid Substitution , Catalysis , Computer-Aided Design , Drug Design , Enzymes/genetics , Enzymes/metabolism , Kinetics , Models, Statistical , Models, Theoretical , Mutagenesis , Neural Networks, Computer
ABSTRACT
We have used design of experiments (DOE) and systematic variance to efficiently explore glutathione transferase substrate specificities caused by amino acid substitutions. Amino acid substitutions selected using phylogenetic analysis were synthetically combined using a DOE design to create an information-rich set of gene variants, termed infologs. We used machine learning to identify and quantify protein sequence-function relationships against 14 different substrates. The resulting models were quantitative and predictive, serving as a guide for engineering of glutathione transferase activity toward a diverse set of herbicides. Predictive quantitative models like those presented here have broad applicability for bioengineering.
Subjects
Amino Acid Substitution/genetics , Glutathione Transferase/chemistry , Herbicide Resistance/genetics , Plant Proteins/chemistry , Synthetic Biology/methods , Triticum/genetics , Amino Acid Sequence , Glutathione Transferase/genetics , Glutathione Transferase/metabolism , Machine Learning , Molecular Sequence Data , Plant Proteins/genetics , Plant Proteins/metabolism , Research Design , Sequence Analysis, Protein
ABSTRACT
Lentiviral vectors are useful experimental tools for stable gene delivery and have been used to treat human inherited genetic disorders and hematologic malignancies with promising results. Because some of the lentiviral vector components are cytotoxic, transient plasmid transfection has been used to produce the large batches needed for clinical trials. However, this method is costly, poorly reproducible and hard to scale up. Here we describe a general method for construction of stable packaging cell lines that continuously produce lentiviral vectors. This uses Cre recombinase-mediated cassette exchange to insert a codon-optimised HIV-1 Gag-Pol expression construct in a continuously expressed locus in 293FT cells. Subsequently Rev, envelope and vector genome expression cassettes are serially transfected. Vector titers in excess of 10⁶ transducing units/ml can be harvested from the final producer clones, which can be increased to 10⁸ TU/ml by concentration. This method will be of use to all basic and clinical investigators who wish to produce large batches of lentiviral vectors.
Subjects
Genetic Vectors/genetics , HEK293 Cells , Lentivirus/genetics , Virus Assembly , Gene Expression , HIV-1/genetics , HIV-1/metabolism , Homologous Recombination , Humans , Protein Precursors/genetics , Protein Precursors/metabolism , Retroviridae/genetics , Retroviridae/metabolism , Viral Envelope Proteins/genetics , Viral Envelope Proteins/metabolism , gag Gene Products, Human Immunodeficiency Virus/genetics , gag Gene Products, Human Immunodeficiency Virus/metabolism , pol Gene Products, Human Immunodeficiency Virus/genetics , pol Gene Products, Human Immunodeficiency Virus/metabolism , rev Gene Products, Human Immunodeficiency Virus/genetics , rev Gene Products, Human Immunodeficiency Virus/metabolism
ABSTRACT
The expression of functional proteins in heterologous hosts is a cornerstone of modern biotechnology. Unfortunately, proteins are often difficult to express outside their original context. They might contain codons that are rarely used in the desired host, come from organisms that use non-canonical code or contain expression-limiting regulatory elements within their coding sequence. Improvements in the speed and cost of gene synthesis have facilitated the complete redesign of entire gene sequences to maximize the likelihood of high protein expression. Redesign strategies are discussed here, including modification of translation initiation regions, alteration of mRNA structural elements and use of different codon biases.
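One of the redesign strategies mentioned above, changing codon bias, can be illustrated with a deliberately simple Python sketch that swaps every codon for a single preferred synonym. The `PREFERRED` table and the function names are hypothetical and chosen only for illustration; real redesign tools weigh many more variables (initiation region, mRNA structure, codon distribution) than this one-codon-per-residue rule. The codon table itself is the standard genetic code.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical host preference: one codon per residue, for illustration only.
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred=PREFERRED) -> str:
    """Replace every codon with the preferred synonym for the same residue."""
    return "".join(preferred[aa] for aa in translate(dna.upper()))


original = "ATGAGAAGGCTATAG"  # made-up example encoding M-R-R-L-stop
recoded = recode(original)
```

The encoded protein is unchanged by construction, so any difference in expression between `original` and `recoded` would be attributable to the nucleotide sequence alone.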
Subjects
Codon/genetics , Gene Expression Regulation/genetics , Protein Engineering/methods , Proteins/genetics , Recombinant Proteins/biosynthesis , Recombinant Proteins/genetics , Sequence Analysis, DNA/methods , Cloning, Molecular/methods
ABSTRACT
Several protein engineering approaches were combined to optimize the selectivity and activity of Vibrio fluvialis aminotransferase (Vfat) for the synthesis of (3S,5R)-ethyl 3-amino-5-methyloctanoate, a key intermediate in the synthesis of imagabalin, an advanced candidate for the treatment of generalized anxiety disorder. Starting from wild-type Vfat, which had extremely low activity catalyzing the desired reaction, we engineered an improved enzyme with a 60-fold increase in initial reaction velocity for transamination of (R)-ethyl 5-methyl 3-oxooctanoate to (3S,5R)-ethyl 3-amino-5-methyloctanoate. To achieve this, <450 variants were screened, which allowed accurate assessment of enzyme performance using a low-throughput ultra performance liquid chromatography assay. During the course of this work, crystal structures of Vfat wild type and an improved variant (Vfat variant r414) were solved and they are reported here for the first time. This work also provides insight into the critical residues for substrate specificity for the transamination of (R)-ethyl 5-methyl 3-oxooctanoate and structurally related β-ketoesters.
Subjects
Amino Acids/metabolism , Caprylates/metabolism , Protein Engineering/methods , Transaminases/genetics , Transaminases/metabolism , Vibrio/enzymology , Kinetics , Models, Molecular , Mutation , Protein Conformation , Sequence Homology, Amino Acid , Substrate Specificity , Transaminases/chemistry
ABSTRACT
The promise of synthetic biology lies in the creation of novel function from the proper combination of genetic elements. De novo gene synthesis has become a cost-effective method for building virtually any conceptualized genetic construct, removing the constraints of extant sequences, and greatly facilitating study of the relationships between gene sequence and function. With the rapid increase in the number and variety of characterized and cataloged genetic elements, tools that facilitate assembly of such parts into functional constructs (genes, vectors, circuits, etc.) are essential. The Gene Designer software allows scientists and engineers to readily manage and recombine genetic elements into novel assemblies. It also provides tools for the simulation of molecular cloning schemes as well as the engineering and optimization of protein-coding sequences. Together, the functions in Gene Designer provide a complete capability to design functional genetic constructs.
Subjects
Computational Biology/methods , DNA/genetics , Genes, Synthetic/genetics , Synthetic Biology/methods , Base Sequence , Cloning, Molecular , Protein Biosynthesis , Software
ABSTRACT
DNA sequences are now far more readily available in silico than as physical DNA. De novo gene synthesis is an increasingly cost-effective method for building genetic constructs, and effectively removes the constraint of basing constructs on extant sequences. This allows scientists and engineers to experimentally test their hypotheses relating sequence to function. Molecular biologists, and now synthetic biologists, are characterizing and cataloging genetic elements with specific functions, aiming to combine them to perform complex functions. However, the most common purpose of synthetic genes is for the expression of an encoded protein. The huge number of different proteins makes it impossible to characterize and catalog each functional gene. Instead, it is necessary to abstract design principles from experimental data: data that can be generated by making predictions followed by synthesizing sequences to test those predictions. Because of the degeneracy of the genetic code, design of gene sequences to encode proteins is a high-dimensional problem, so there is no single simple formula to guarantee success. Nevertheless, there are several straightforward steps that can be taken to greatly increase the probability that a designed sequence will result in expression of the encoded protein. In this chapter, we discuss gene sequence parameters that are important for protein expression. We also describe algorithms for optimizing these parameters, and troubleshooting procedures that can be helpful when initial attempts fail. Finally, we show how many of these methods can be accomplished using the synthetic biology software tool Gene Designer.
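The high dimensionality that follows from the degeneracy of the genetic code can be made concrete by counting how many DNA sequences encode a given protein. The Python sketch below does this for the standard genetic code; the function name and the example peptide are hypothetical, and the calculation ignores every design constraint other than synonymy.

```python
from collections import Counter
from math import log10, prod

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Number of synonymous codons per residue, e.g. 6 for L and 1 for M.
DEGENERACY = Counter(CODON_TABLE.values())


def count_encodings(protein: str) -> int:
    """Number of distinct coding sequences that translate to `protein`.

    Returns 0 if the protein contains a letter outside the standard code.
    """
    return prod(DEGENERACY[aa] for aa in protein.upper())


peptide = "MKL"                      # made-up three-residue example
n_sequences = count_encodings(peptide)  # 1 (M) * 2 (K) * 6 (L)
magnitude = log10(n_sequences)          # order of magnitude of the design space
```

Because the count is a product over residues, it grows exponentially with protein length, which is why the text argues for abstracting design principles rather than cataloging individual genes.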