Results 1 - 20 of 28
1.
Cell ; 158(6): 1431-1443, 2014 Sep 11.
Article in English | MEDLINE | ID: mdl-25215497

ABSTRACT

Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.


Subject(s)
Arabidopsis/genetics , Nucleotide Motifs , Sequence Analysis, DNA , Transcription Factors/metabolism , Arabidopsis/metabolism , Chromatin Immunoprecipitation , Humans , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Protein Binding , Quantitative Trait Loci
2.
Nature ; 451(7179): 704-7, 2008 Feb 07.
Article in English | MEDLINE | ID: mdl-18256669

ABSTRACT

Biosignatures and structures in the geological record indicate that microbial life has inhabited Earth for the past 3.5 billion years or so. Research in the physical sciences has been able to generate statements about the ancient environment that hosted this life. These include the chemical compositions and temperatures of the early ocean and atmosphere. Only recently have the natural sciences been able to provide experimental results describing the environments of ancient life. Our previous work with resurrected proteins indicated that ancient life lived in a hot environment. Here we expand the timescale of resurrected proteins to provide a palaeotemperature trend of the environments that hosted life from 3.5 to 0.5 billion years ago. The thermostability of more than 25 phylogenetically dispersed ancestral elongation factors suggests that the environment supporting ancient life cooled progressively by 30 degrees C during that period. Here we show that our results are robust to potential statistical bias associated with the posterior distribution of inferred character states, phylogenetic ambiguity, and uncertainties in the amino-acid equilibrium frequencies used by evolutionary models. Our results are further supported by a nearly identical cooling trend for the ancient ocean as inferred from the deposition of oxygen isotopes. The convergence of results from natural and physical sciences suggests that ancient life has continually adapted to changes in environmental temperatures throughout its evolutionary history.


Subject(s)
Bacteria/metabolism , Bacterial Proteins/chemistry , Biological Evolution , Seawater/microbiology , Temperature , Adaptation, Physiological , Bacteria/classification , Bacterial Proteins/analysis , Enzyme Stability , History, Ancient , Hot Temperature , Peptide Elongation Factor Tu/analysis , Peptide Elongation Factor Tu/chemistry , Phylogeny , Time Factors , Uncertainty
3.
Cancers (Basel) ; 16(4)2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38398153

ABSTRACT

Protein engineering can be used to tailor enzymes for medical purposes, including antibody-directed enzyme prodrug therapy (ADEPT), which can act as a tumor-targeted alternative to conventional chemotherapy for cancer. In ADEPT, the antibody serves as a vector, delivering a drug-activating enzyme selectively to the tumor site. Glutathione transferases (GSTs) are a family of naturally occurring detoxication enzymes, and the finding that some of them are overexpressed in tumors has been exploited to develop GST-activated prodrugs. The prodrug Telcyta is activated by GST P1-1, which is the GST most commonly elevated in cancer cells, implying that tumors overexpressing GST P1-1 should be particularly vulnerable to Telcyta. Promising antitumor activity has been noted in clinical trials, but the wildtype enzyme has modest activity with Telcyta, and further functional improvement would enhance its usefulness for ADEPT. We utilized protein engineering to construct human GST P1-1 gene variants in the search for enzymes with enhanced activity with Telcyta. The variant Y109H displayed a 2.9-fold higher enzyme activity compared to the wild-type GST P1-1. However, increased catalytic potency was accompanied by decreased thermal stability of the Y109H enzyme, losing 99% of its activity in 8 min at 50 °C. Thermal stability was restored by four additional mutations simultaneously introduced without loss of the enhanced activity with Telcyta. The mutation Q85R was identified as an important contributor to the regained thermostability. These results represent a first step towards a functional ADEPT application for Telcyta.

4.
J Am Chem Soc ; 135(38): 14425-32, 2013 Sep 25.
Article in English | MEDLINE | ID: mdl-23987134

ABSTRACT

Members of the old yellow enzyme (OYE) family are widely used, effective biocatalysts for the stereoselective trans-hydrogenation of activated alkenes. To further expand their substrate scope and improve catalytic performance, we have applied a protein engineering strategy called circular permutation (CP) to enhance the function of OYE1 from Saccharomyces pastorianus. CP can influence a biocatalyst's function by altering protein backbone flexibility and active site accessibility, both critical performance features because the catalytic cycle for OYE1 is thought to involve rate-limiting conformational changes. To explore the impact of CP throughout the OYE1 protein sequence, we implemented a highly efficient approach for cell-free cpOYE library preparation by combining whole-gene synthesis with in vitro transcription/translation. The versatility of such an ex vivo system was further demonstrated by the rapid and reliable functional evaluation of library members under variable environmental conditions with three reference substrates: ketoisophorone, cinnamaldehyde, and (S)-carvone. Library analysis identified over 70 functional OYE1 variants with several biocatalysts exhibiting over an order of magnitude improved catalytic activity. Although catalytic gains of individual cpOYE library members vary by substrate, the locations of new protein termini in functional variants for all tested substrates fall within the same four distinct loop/lid regions near the active site. Our findings demonstrate the importance of these structural elements in enzyme function and support the hypothesis of conformational flexibility as a limiting factor for catalysis in wild type OYE.


Subject(s)
Bacterial Proteins/chemistry , NADPH Dehydrogenase/chemistry , Acrolein/analogs & derivatives , Acrolein/chemistry , Bacterial Proteins/genetics , Biocatalysis , Catalytic Domain , Cyclohexane Monoterpenes , Cyclohexanones/chemistry , Kinetics , Models, Molecular , Monoterpenes/chemistry , NADPH Dehydrogenase/genetics , Protein Conformation , Protein Engineering , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Saccharomyces/enzymology , Stereoisomerism
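
Note on circular permutation: the operation described in the abstract above can be illustrated at the sequence level. The sketch below is a minimal illustration, not the authors' library-construction method; the toy sequence, the optional linker, and the function names are assumptions introduced here. It moves the N-terminus to a chosen residue and joins the original termini.

```python
def circular_permutation(sequence: str, new_start: int, linker: str = "") -> str:
    """Return the permutant whose N-terminus is residue `new_start` (0-based).

    The original C-terminal segment comes first, then an optional linker that
    joins the old termini, then the original N-terminal segment.
    """
    if not 0 <= new_start < len(sequence):
        raise ValueError("new_start must index a residue of the sequence")
    return sequence[new_start:] + linker + sequence[:new_start]


def all_permutants(sequence: str, linker: str = "") -> dict:
    """Enumerate every permutant with a new start site (position 0 is the parent)."""
    return {
        start: circular_permutation(sequence, start, linker)
        for start in range(1, len(sequence))
    }


# Hypothetical toy sequence; it is not OYE1.
toy = "MKTAYIAK"
permutant = circular_permutation(toy, 3, linker="GG")
library = all_permutants(toy)
```

A protein of N residues has N - 1 possible new start sites, which is why a systematic library such as the one described scales linearly with sequence length.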
5.
Proc Natl Acad Sci U S A ; 107(5): 1948-53, 2010 Feb 02.
Article in English | MEDLINE | ID: mdl-20080675

ABSTRACT

Any system, natural or human-made, is better understood if we analyze both its history and its structure. Here we combine structural analyses with a "Reconstructed Evolutionary Adaptive Path" (REAP) analysis that used the evolutionary and functional history of DNA polymerases to replace amino acids to enable polymerases to accept a new class of triphosphate substrates, those having their 3'-OH ends blocked as a 3'-ONH₂ group (dNTP-ONH₂). Analogous to widely used 2',3'-dideoxynucleoside triphosphates (ddNTPs), dNTP-ONH₂s terminate primer extension. Unlike ddNTPs, however, primer extension can be resumed by cleaving an O-N bond to restore an -OH group to the 3'-end of the primer. REAP combined with crystallographic analyses identified 35 sites where replacements might improve the ability of Taq to accept dNTP-ONH₂s. A library of 93 Taq variants, each having replacements at three or four of these sites, held eight variants having improved ability to accept dNTP-ONH₂ substrates. Two of these (A597T, L616A, F667Y, E745H, and E520G, K540I, L616A) performed notably well. The second variant incorporated both dNTP-ONH₂s and ddNTPs faithfully and efficiently, supporting extension-cleavage-extension cycles applicable in parallel sequencing and in SNP detection through competition between reversible and irreversible terminators. Dissecting these results showed that one replacement (L616A), not previously identified, allows Taq to incorporate both reversible and irreversible terminators. Modeling showed how L616A might open space behind Phe-667, allowing it to move to accommodate the larger 3'-substituent. This work provides polymerases for DNA analyses and shows how evolutionary analyses help explore relationships between structure and function in proteins.


Subject(s)
Taq Polymerase/genetics , Taq Polymerase/metabolism , Amino Acid Substitution , Base Sequence , Catalytic Domain/genetics , DNA Primers/genetics , Evolution, Molecular , Genetic Variation , Models, Molecular , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Substrate Specificity , Taq Polymerase/chemistry
6.
Langmuir ; 28(25): 9878-84, 2012 Jun 26.
Article in English | MEDLINE | ID: mdl-22616757

ABSTRACT

Antibodies were patterned onto flexible plastic films using the flexographic printing process. An ink formulation was developed using high molecular weight polyvinyl alcohol in carbonate-bicarbonate buffer. In order to aid both antibody adhesion and the quality of definition in the printed features, a nitrocellulose coating was developed that was capable of being discretely patterned, thus increasing the signal-to-noise ratio of an antibody array. Printing antibody features such as dots, squares, text, and fine lines were reproduced effectively. Furthermore, this process could be easily adapted for printing of other biological materials, including, but not limited to, enzymes, DNA, proteins, aptamers, and cells.


Subject(s)
Antibodies, Immobilized/chemistry , Printing/methods , Animals , Antibodies, Immobilized/metabolism , Collodion/chemistry , Coloring Agents/chemistry , Peroxidase/metabolism , Rheology
7.
Protein Expr Purif ; 83(1): 37-46, 2012 May.
Article in English | MEDLINE | ID: mdl-22425659

ABSTRACT

The DNA sequence used to encode a polypeptide can have dramatic effects on its expression. Lack of readily available tools has until recently inhibited meaningful experimental investigation of this phenomenon. Advances in synthetic biology and the application of modern engineering approaches now provide the tools for systematic analysis of the sequence variables affecting heterologous expression of recombinant proteins. We here discuss how these new tools are being applied and how they circumvent the constraints of previous approaches, highlighting some of the surprising and promising results emerging from the developing field of gene engineering.


Subject(s)
Genetic Engineering/methods , Recombinant Proteins/biosynthesis , Recombinant Proteins/genetics , Codon , Gene Library , Genetic Vectors , Humans , Open Reading Frames , Synthetic Biology
8.
Proc Natl Acad Sci U S A ; 106(14): 5610-5, 2009 Apr 07.
Article in English | MEDLINE | ID: mdl-19307582

ABSTRACT

SCHEMA structure-guided recombination of 3 fungal class II cellobiohydrolases (CBH II cellulases) has yielded a collection of highly thermostable CBH II chimeras. Twenty-three of 48 genes sampled from the 6,561 possible chimeric sequences were secreted by the Saccharomyces cerevisiae heterologous host in catalytically active form. Five of these chimeras have half-lives of thermal inactivation at 63 degrees C that are greater than the most stable parent, CBH II enzyme from the thermophilic fungus Humicola insolens, which suggests that this chimera collection contains hundreds of highly stable cellulases. Twenty-five new sequences were designed based on mathematical modeling of the thermostabilities for the first set of chimeras. Ten of these sequences were expressed in active form; all 10 retained more activity than H. insolens CBH II after incubation at 63 degrees C. The total of 15 validated thermostable CBH II enzymes have high sequence diversity, differing from their closest natural homologs at up to 63 amino acid positions. Selected purified thermostable chimeras hydrolyzed phosphoric acid swollen cellulose at temperatures 7 to 15 degrees C higher than the parent enzymes. These chimeras also hydrolyzed as much or more cellulose than the parent CBH II enzymes in long-time cellulose hydrolysis assays and had pH/activity profiles as broad, or broader than, the parent enzymes. Generating this group of diverse, thermostable fungal CBH II chimeras is the first step in building an inventory of stable cellulases from which optimized enzyme mixtures for biomass conversion can be formulated.


Subject(s)
Cellulases/genetics , Protein Engineering/methods , Recombination, Genetic , Enzyme Stability , Fungal Proteins/genetics , Hot Temperature , Recombinant Fusion Proteins , Saccharomyces cerevisiae/genetics
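
Note on the library size: the figure of 6,561 possible chimeras quoted in the abstract above equals 3^8, which is consistent with three parents recombined over eight sequence blocks. The block count is inferred here from the arithmetic and is not stated in the abstract. A small sketch of that reasoning, with hypothetical function names:

```python
from itertools import product

PARENTS = 3            # number of parent enzymes, from the abstract
TOTAL_CHIMERAS = 6561  # number of possible chimeric sequences, from the abstract


def infer_block_count(parents: int, total: int) -> int:
    """Find the block count b such that parents ** b == total."""
    blocks, size = 0, 1
    while size < total:
        size *= parents
        blocks += 1
    if size != total:
        raise ValueError("total is not an exact power of the parent count")
    return blocks


blocks = infer_block_count(PARENTS, TOTAL_CHIMERAS)

# Each chimera is a tuple naming the parent (1, 2 or 3) that donates each block.
# The enumeration includes the three unrecombined parents themselves.
chimeras = list(product(range(1, PARENTS + 1), repeat=blocks))
```

Sampling 48 genes from this space, as the abstract reports, covers well under 1% of the possible sequences, which is why a predictive model is needed to pick the second set of designs.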
9.
J Biol Chem ; 284(39): 26229-33, 2009 Sep 25.
Article in English | MEDLINE | ID: mdl-19625252

ABSTRACT

A quantitative linear model accurately (R² = 0.88) describes the thermostabilities of 54 characterized members of a family of fungal cellobiohydrolase class II (CBH II) cellulase chimeras made by SCHEMA recombination of three fungal enzymes, demonstrating that the contributions of SCHEMA sequence blocks to stability are predominantly additive. Thirty-one of 31 predicted thermostable CBH II chimeras have thermal inactivation temperatures higher than the most thermostable parent CBH II, from Humicola insolens, and the model predicts that hundreds more CBH II chimeras share this superior thermostability. Eight of eight thermostable chimeras assayed hydrolyze the solid cellulosic substrate Avicel at temperatures at least 5 degrees C above the most stable parent, and seven of these showed superior activity in 16-h Avicel hydrolysis assays. The sequence-stability model identified a single block of sequence that adds 8.5 degrees C to chimera thermostability. Mutating individual residues in this block identified the C313S substitution as responsible for the entire thermostabilizing effect. Introducing this mutation into the two recombination parent CBH IIs not featuring it (Hypocrea jecorina and H. insolens) decreased inactivation, increased maximum Avicel hydrolysis temperature, and improved long-time hydrolysis performance. This mutation also stabilized and improved Avicel hydrolysis by Phanerochaete chrysosporium CBH II, which is only 55-56% identical to recombination parent CBH IIs. Furthermore, the C313S mutation increased total H. jecorina CBH II activity secreted by the Saccharomyces cerevisiae expression host more than 10-fold. Our results show that SCHEMA structure-guided recombination enables quantitative prediction of cellulase chimera thermostability and efficient identification of stabilizing mutations.


Subject(s)
Cellulose 1,4-beta-Cellobiosidase/genetics , Fungal Proteins/genetics , Mutation , Recombination, Genetic , Amino Acid Sequence , Ascomycota/enzymology , Binding Sites/genetics , Cellulose/chemistry , Cellulose/metabolism , Cellulose 1,4-beta-Cellobiosidase/chemistry , Cellulose 1,4-beta-Cellobiosidase/metabolism , Computational Biology/methods , Enzyme Stability/genetics , Fungal Proteins/chemistry , Fungal Proteins/metabolism , Hydrogen-Ion Concentration , Hydrolysis , Hypocrea/enzymology , Linear Models , Models, Molecular , Molecular Sequence Data , Protein Structure, Tertiary , Sequence Homology, Amino Acid , Species Specificity , Substrate Specificity , Temperature
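
Note on the additive model: the abstract above reports that block contributions to thermostability are predominantly additive. A minimal sketch of what prediction with such a model might look like follows. Only the 8.5 degrees C block effect comes from the abstract; the baseline, the other coefficients, the eight-block layout, and the assignment of the stabilizing block to position 7 of parent 3 are hypothetical placeholders.

```python
def predict_stability(chimera, baseline, contributions):
    """Additive model: baseline plus one term per (block position, parent) pair.

    `chimera` is a tuple giving the donor parent of each block, in order.
    Pairs absent from `contributions` are treated as contributing 0.0.
    """
    return baseline + sum(
        contributions.get((position, parent), 0.0)
        for position, parent in enumerate(chimera, start=1)
    )


# Hypothetical coefficients in degrees C, except the 8.5 reported in the abstract.
BASELINE = 60.0
CONTRIBUTIONS = {
    (2, 1): 1.0,   # hypothetical
    (4, 2): -2.0,  # hypothetical
    (7, 3): 8.5,   # block effect size from the abstract; position is assumed
}

without_block = (1, 1, 1, 1, 1, 1, 1, 1)
with_block = (1, 1, 1, 1, 1, 1, 3, 1)

gain = (
    predict_stability(with_block, BASELINE, CONTRIBUTIONS)
    - predict_stability(without_block, BASELINE, CONTRIBUTIONS)
)
```

Because the model has no interaction terms, swapping a single block changes the prediction by exactly that block's coefficient regardless of the rest of the chimera, which is what makes it possible to trace a stabilizing effect to one block and then to one substitution.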
10.
Mol Syst Biol ; 5: 309, 2009.
Article in English | MEDLINE | ID: mdl-19756048

ABSTRACT

The type III secretion system (T3SS) exports proteins from the cytoplasm, through both the inner and outer membranes, to the external environment. Here, a system is constructed to harness the T3SS encoded within Salmonella Pathogenicity Island 1 to export proteins of biotechnological interest. The system is composed of an operon containing the target protein fused to an N-terminal secretion tag and its cognate chaperone. Transcription is controlled by a genetic circuit that only turns on when the cell is actively secreting protein. The system is refined using a small human protein (DH domain) and demonstrated by exporting three silk monomers (ADF-1, -2, and -3), representative of different types of spider silk. Synthetic genes encoding silk monomers were designed to enhance genetic stability and codon usage, constructed by automated DNA synthesis, and cloned into the secretion control system. Secretion rates up to 1.8 mg l⁻¹ h⁻¹ are demonstrated with up to 14% of expressed protein secreted. This work introduces new parts to control protein secretion in Gram-negative bacteria, which will be broadly applicable to problems in biotechnology.


Subject(s)
Fibroins/metabolism , Recombinant Fusion Proteins/metabolism , Salmonella/physiology , Amino Acid Sequence , Animals , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Fibroins/genetics , Humans , Membrane Proteins/genetics , Membrane Proteins/metabolism , Models, Biological , Molecular Sequence Data , Protein Engineering/methods , Protein Transport , Recombinant Fusion Proteins/biosynthesis , Recombinant Fusion Proteins/genetics , Salmonella/genetics , Salmonella/metabolism , Sequence Alignment , Signal Transduction , Spiders/genetics
11.
ACS Synth Biol ; 7(7): 1730-1741, 2018 07 20.
Article in English | MEDLINE | ID: mdl-29782150

ABSTRACT

Directed evolution experiments designed to improve the activity of a biocatalyst have increased in sophistication from the early days of completely random mutagenesis. Sequence-based and structure-based methods have been developed to identify "hotspot" positions that when randomized provide a higher frequency of beneficial mutations that improve activity. These focused mutagenesis methods reduce library sizes and therefore reduce screening burden, accelerating the rate of finding improved enzymes. Looking for further acceleration in finding improved enzymes, we investigated whether two existing methods, one sequence-based (Protein GPS) and one structure-based (using Bioluminate and MOE), were sufficiently predictive to provide not just the hotspot position, but also the amino acid substitution that improved activity at that position. By limiting the libraries to variants that contained only specific amino acid substitutions, library sizes were kept to less than 100 variants. For an initial round of ATA-117 R-selective transaminase evolution, we found that the methods used produced libraries where 9% and 18% of the amino acid substitutions chosen were amino acids that improved reaction performance in lysates. The ability to create combinations of mutations as part of the initial design was confounded by the relatively large number of predicted mutations that were inactivating (30% and 45% for the sequence-based and structure-based methods, respectively). Despite this, combining several mutations identified within a given method produced variant lysates 7- and 9-fold more active than the wild-type lysate, highlighting the capability of mutations chosen this way to generate large advances in activity in addition to the reductions in screening.


Subject(s)
Directed Molecular Evolution , Mutagenesis/genetics , Mutation/genetics
12.
BMC Biotechnol ; 7: 16, 2007 Mar 26.
Article in English | MEDLINE | ID: mdl-17386103

ABSTRACT

BACKGROUND: Altering a protein's function by changing its sequence allows natural proteins to be converted into useful molecular tools. Current protein engineering methods are limited by a lack of high throughput physical or computational tests that can accurately predict protein activity under conditions relevant to its final application. Here we describe a new synthetic biology approach to protein engineering that avoids these limitations by combining high throughput gene synthesis with machine learning-based design algorithms. RESULTS: We selected 24 amino acid substitutions to make in proteinase K from alignments of homologous sequences. We then designed and synthesized 59 specific proteinase K variants containing different combinations of the selected substitutions. The 59 variants were tested for their ability to hydrolyze a tetrapeptide substrate after the enzyme was first heated to 68 degrees C for 5 minutes. Sequence and activity data were analyzed using machine learning algorithms. This analysis was used to design a new set of variants predicted to have increased activity over the training set, that were then synthesized and tested. By performing two cycles of machine learning analysis and variant design we obtained 20-fold improved proteinase K variants while only testing a total of 95 variant enzymes. CONCLUSION: The number of protein variants that must be tested to obtain significant functional improvements determines the type of tests that can be performed. Protein engineers wishing to modify the property of a protein to shrink tumours or catalyze chemical reactions under industrial conditions have until now been forced to accept high throughput surrogate screens to measure protein properties that they hope will correlate with the functionalities that they intend to modify. By reducing the number of variants that must be tested to fewer than 100, machine learning algorithms make it possible to use more complex and expensive tests so that only protein properties that are directly relevant to the desired application need to be measured. Protein design algorithms that only require the testing of a small number of variants represent a significant step towards a generic, resource-optimized protein engineering process.


Subject(s)
Artificial Intelligence , Drug Design , Endopeptidase K/chemistry , Endopeptidase K/metabolism , Escherichia coli/metabolism , Mutagenesis, Site-Directed/methods , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence , Endopeptidase K/genetics , Escherichia coli/genetics , Genes, Synthetic/genetics , Molecular Sequence Data , Mutation , Protein Engineering/methods , Recombinant Proteins/chemistry , Recombinant Proteins/metabolism , Structure-Activity Relationship
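
Note on the design cycle: the abstract above describes analyzing sequence and activity data, then designing new variants predicted to be more active. The sketch below illustrates the general shape of one such cycle with a deliberately simple stand-in: a mean-difference ranking of substitutions. It is not the machine learning algorithms the authors used, and the substitution labels, activities, and function names are hypothetical.

```python
from statistics import mean


def substitution_effects(variants):
    """Estimate each substitution's effect as mean activity with it minus without it.

    `variants` is a list of (set_of_substitutions, measured_activity) pairs.
    Substitutions present in every variant or in none are skipped.
    """
    all_substitutions = set().union(*(subs for subs, _ in variants))
    effects = {}
    for sub in all_substitutions:
        with_sub = [act for subs, act in variants if sub in subs]
        without_sub = [act for subs, act in variants if sub not in subs]
        if with_sub and without_sub:
            effects[sub] = mean(with_sub) - mean(without_sub)
    return effects


def propose_variant(effects, max_substitutions):
    """Combine the top-ranked substitutions whose estimated effect is positive."""
    beneficial = [sub for sub, effect in effects.items() if effect > 0]
    ranked = sorted(beneficial, key=lambda sub: effects[sub], reverse=True)
    return frozenset(ranked[:max_substitutions])


# Hypothetical training round: substitution labels and activities are made up.
training_round = [
    ({"S1"}, 2.0),
    ({"S2"}, 0.5),
    ({"S1", "S3"}, 3.0),
    ({"S2", "S3"}, 1.5),
    (set(), 1.0),
]

effects = substitution_effects(training_round)
next_design = propose_variant(effects, max_substitutions=2)
```

In a real cycle the proposed variants would be synthesized and assayed, and their measurements appended to the training data before the next round.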
13.
Nat Biotechnol ; 20(12): 1251-5, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12426575

ABSTRACT

We describe synthetic shuffling, an evolutionary protein engineering technology in which every amino acid from a set of parents is allowed to recombine independently of every other amino acid. With the use of degenerate oligonucleotides, synthetic shuffling provides a direct route from database sequence information to functional libraries. Physical starting genes are unnecessary, and additional design criteria such as optimal codon usage or known beneficial mutations can also be incorporated. We performed synthetic shuffling of 15 subtilisin genes and obtained active and highly chimeric enzymes with desirable combinations of properties that we did not obtain by other directed-evolution methods.


Subject(s)
Amino Acids/genetics , Combinatorial Chemistry Techniques/methods , DNA Shuffling/methods , Protein Engineering/methods , Recombinant Proteins/genetics , Amino Acid Sequence , Amino Acids/chemistry , Bacillus subtilis/enzymology , Bacillus subtilis/genetics , Hydrogen-Ion Concentration , Molecular Sequence Data , Peptide Library , Sequence Alignment/methods , Sequence Analysis, Protein/methods
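
Note on synthetic shuffling: the abstract above describes letting every amino acid from a set of parents recombine independently of every other. At the sequence level this can be sketched as choosing, at each aligned position, any residue observed in any parent. The toy parents below are hypothetical and gap-free; handling of alignment gaps and the degenerate-oligonucleotide chemistry are outside this sketch.

```python
import random
from math import prod


def residue_choices(aligned_parents):
    """List, per aligned position, the distinct residues seen across parents."""
    if len({len(parent) for parent in aligned_parents}) != 1:
        raise ValueError("parents must be aligned to the same length")
    return [sorted(set(column)) for column in zip(*aligned_parents)]


def library_size(choices):
    """Number of distinct sequences when every position varies independently."""
    return prod(len(options) for options in choices)


def sample_chimera(choices, rng):
    """Draw one library member by picking a residue independently per position."""
    return "".join(rng.choice(options) for options in choices)


# Hypothetical aligned toy parents; they are not subtilisin sequences.
parents = ["MKVLA", "MRVLS", "MKILA"]
choices = residue_choices(parents)
size = library_size(choices)
member = sample_chimera(choices, random.Random(0))
```

Independent recombination makes the theoretical library size the product of the per-position diversities, so it grows far faster than block-wise shuffling of the same parents.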
14.
Protein Eng Des Sel ; 30(8): 543-549, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28967959

ABSTRACT

Exploring the vicinity around a locus of a protein in sequence space may identify homologs with enhanced properties, which could become valuable in biotechnical and other applications. A rational approach to this pursuit is the use of 'infologs', i.e. synthetic sequences with specific substitutions capturing maximal sequence information derived from the evolutionary history of the protein family. Ninety-five such infolog genes of poplar glutathione transferase were synthesized and expressed in Escherichia coli, and the catalytic activities of the proteins determined with alternative substrates. Sequence-activity relationships derived from the infologs were used to design a second set of 47 infologs in which 90% of the members exceeded wild-type properties. Two mutants, C2 (V55I/E95D/D108E/A160V) and G5 (F13L/C70A/G122E), were further functionally characterized. The activities of the infologs with the alternative substrates 1-chloro-2,4-dinitrobenzene and phenethyl isothiocyanate, subject to different chemistries, were positively correlated, indicating that the examined mutations were affecting the overall catalytic competence without major shift in substrate discrimination. By contrast, the enhanced protein expressivity observed in many of the mutants was not similarly correlated with the activities. In conclusion, small libraries of well-defined infologs can be used to systematically explore sequence space to optimize proteins in multidimensional functional space.


Subject(s)
Directed Molecular Evolution/methods , Glutathione Transferase/genetics , Plant Proteins/genetics , Populus/genetics , Recombinant Proteins/genetics , Escherichia coli/genetics , Glutathione Transferase/chemistry , Glutathione Transferase/metabolism , Models, Molecular , Plant Proteins/chemistry , Plant Proteins/metabolism , Populus/enzymology , Protein Engineering , Recombinant Proteins/chemistry , Recombinant Proteins/metabolism
15.
BMC Bioinformatics ; 7: 285, 2006 Jun 06.
Article in English | MEDLINE | ID: mdl-16756672

ABSTRACT

BACKGROUND: Direct synthesis of genes is rapidly becoming the most efficient way to make functional genetic constructs and enables applications such as codon optimization, RNAi resistant genes and protein engineering. Here we introduce a software tool that drastically facilitates the design of synthetic genes. RESULTS: Gene Designer is a stand-alone software for fast and easy design of synthetic DNA segments. Users can easily add, edit and combine genetic elements such as promoters, open reading frames and tags through an intuitive drag-and-drop graphic interface and a hierarchical DNA/Protein object map. Using advanced optimization algorithms, open reading frames within the DNA construct can readily be codon optimized for protein expression in any host organism. Gene Designer also includes features such as a real-time sliding calculator of oligonucleotide annealing temperatures, sequencing primer generator, tools for avoidance or inclusion of restriction sites, and options to maximize or minimize sequence identity to a reference. CONCLUSION: Gene Designer is an expandable Synthetic Biology workbench suitable for molecular biologists interested in the de novo creation of genetic constructs.


Subject(s)
DNA/chemistry , DNA/genetics , Genes, Synthetic/genetics , Genetic Engineering/methods , Sequence Analysis, DNA/methods , Software , Systems Biology/methods , Base Sequence , Computer-Aided Design , Drug Design , Molecular Sequence Data , User-Computer Interface
16.
Curr Opin Chem Biol ; 9(2): 202-9, 2005 Apr.
Article in English | MEDLINE | ID: mdl-15811806

ABSTRACT

There are two main reasons to try to predict an enzyme's function from its sequence. The first is to identify the components and thus the functional capabilities of an organism, the second is to create enzymes with specific properties. Genomics, expression analysis, proteomics and metabonomics are largely directed towards understanding how information flows from DNA sequence to protein functions within an organism. This review focuses on information flow in the opposite direction: the applicability of what is being learned from natural enzymes to improve methods for catalyst design.


Subject(s)
Enzymes/chemistry , Protein Engineering , Sequence Analysis, Protein , Animals , Humans , Structural Homology, Protein
17.
J Mol Biol ; 328(5): 1061-9, 2003 May 16.
Article in English | MEDLINE | ID: mdl-12729741

ABSTRACT

During protein evolution, amino acids change due to a combination of functional constraints and genetic drift. Proteins frequently contain pairs of amino acids that appear to change together (covariation). Analysis of covariation from naturally occurring sets of orthologs cannot distinguish between residue pairs retained by functional requirements of the protein and those pairs existing due to changes along a common evolutionary path. Here, we have separated the two types of covariation by independently recombining every naturally occurring amino acid variant within a set of 15 subtilisin orthologs. Our analysis shows that in this family of subtilisin orthologs, almost all possible pairwise combinations of amino acids can coexist. This suggests that amino acid covariation found in the subtilisin orthologs is almost entirely due to common ancestral origin of the changes rather than functional constraints. We conclude that naturally occurring sequence diversity can be used to identify positions that can vary independently without destroying protein function.


Subject(s)
Amino Acid Substitution , Evolution, Molecular , Subtilisins/genetics , Amino Acid Sequence , Bacillus/enzymology , Bacillus/genetics , Binding Sites , Directed Molecular Evolution , Genetic Variation , Models, Molecular , Phylogeny , Protein Conformation , Subtilisins/chemistry , Subtilisins/physiology , Thermodynamics
18.
Curr Opin Biotechnol ; 14(4): 366-70, 2003 Aug.
Article in English | MEDLINE | ID: mdl-12943844

ABSTRACT

Complex multivariate engineering problems are commonplace and not unique to protein engineering. Mathematical and data-mining tools developed in other fields of engineering have now been applied to analyze sequence-activity relationships of peptides and proteins and to assist in the design of proteins and peptides with specified properties. Decreasing costs of DNA sequencing in conjunction with methods to quickly synthesize statistically representative sets of proteins allow modern heuristic statistics to be applied to protein engineering. This provides an alternative approach to expensive assays or unreliable high-throughput surrogate screens.


Subject(s)
Computational Biology/methods , Enzymes/chemistry , Protein Engineering/trends , Algorithms , Amino Acid Sequence , Amino Acid Substitution , Catalysis , Computer-Aided Design , Drug Design , Enzymes/genetics , Enzymes/metabolism , Kinetics , Models, Statistical , Models, Theoretical , Mutagenesis , Neural Networks, Computer
19.
ACS Synth Biol ; 4(3): 221-7, 2015 Mar 20.
Article in English | MEDLINE | ID: mdl-24905764

ABSTRACT

We have used design of experiments (DOE) and systematic variance to efficiently explore glutathione transferase substrate specificities caused by amino acid substitutions. Amino acid substitutions selected using phylogenetic analysis were synthetically combined using a DOE design to create an information-rich set of gene variants, termed infologs. We used machine learning to identify and quantify protein sequence-function relationships against 14 different substrates. The resulting models were quantitative and predictive, serving as a guide for engineering of glutathione transferase activity toward a diverse set of herbicides. Predictive quantitative models like those presented here have broad applicability for bioengineering.


Subject(s)
Amino Acid Substitution/genetics , Glutathione Transferase/chemistry , Herbicide Resistance/genetics , Plant Proteins/chemistry , Synthetic Biology/methods , Triticum/genetics , Amino Acid Sequence , Glutathione Transferase/genetics , Glutathione Transferase/metabolism , Machine Learning , Molecular Sequence Data , Plant Proteins/genetics , Plant Proteins/metabolism , Research Design , Sequence Analysis, Protein
20.
Trends Biotechnol ; 22(7): 346-53, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15245907

ABSTRACT

The expression of functional proteins in heterologous hosts is a cornerstone of modern biotechnology. Unfortunately, proteins are often difficult to express outside their original context. They might contain codons that are rarely used in the desired host, come from organisms that use non-canonical code or contain expression-limiting regulatory elements within their coding sequence. Improvements in the speed and cost of gene synthesis have facilitated the complete redesign of entire gene sequences to maximize the likelihood of high protein expression. Redesign strategies are discussed here, including modification of translation initiation regions, alteration of mRNA structural elements and use of different codon biases.


Subject(s)
Codon/genetics , Gene Expression Regulation/genetics , Protein Engineering/methods , Proteins/genetics , Recombinant Proteins/biosynthesis , Recombinant Proteins/genetics , Sequence Analysis, DNA/methods , Cloning, Molecular/methods
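
Note on codon bias: one of the redesign strategies named in the abstract above is the use of different codon biases. The simplest version, choosing the most frequently used synonymous codon for each residue, can be sketched as follows. The codon assignments follow the standard genetic code, but the usage frequencies are hypothetical placeholders rather than measured values for any host, and the table covers only four amino acids.

```python
# Toy codon usage table: amino acid -> {codon: relative frequency}.
# Frequencies are hypothetical; a real table would cover all 20 residues and stop.
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "E": {"GAA": 0.6, "GAG": 0.4},
    "L": {"CTG": 0.5, "CTC": 0.3, "TTA": 0.2},
}


def back_translate(protein: str, usage: dict) -> str:
    """Encode a protein using the most frequent codon for each residue."""
    codons = []
    for residue in protein:
        if residue not in usage:
            raise KeyError(f"no codon usage data for residue {residue!r}")
        table = usage[residue]
        codons.append(max(table, key=table.get))
    return "".join(codons)


gene = back_translate("MKEL", TOY_USAGE)
```

As the abstract indicates, codon choice is only one variable; translation initiation regions and mRNA structural elements are also redesigned, so a most-frequent-codon rule on its own should be read as a starting point.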