Results 1 - 19 of 19
1.
Nature ; 623(7989): 1070-1078, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37968394

ABSTRACT

Three billion years of evolution has produced a tremendous diversity of protein molecules, but the full potential of proteins is likely to be much greater. Accessing this potential has been challenging for both computation and experiments because the space of possible protein molecules is much larger than the space of those likely to have functions. Here we introduce Chroma, a generative model for proteins and protein complexes that can directly sample novel protein structures and sequences, and that can be conditioned to steer the generative process towards desired properties and functions. To enable this, we introduce a diffusion process that respects the conformational statistics of polymer ensembles, an efficient neural architecture for molecular systems that enables long-range reasoning with sub-quadratic scaling, layers for efficiently synthesizing three-dimensional structures of proteins from predicted inter-residue geometries and a general low-temperature sampling algorithm for diffusion models. Chroma achieves protein design as Bayesian inference under external constraints, which can involve symmetries, substructure, shape, semantics and even natural-language prompts. The experimental characterization of 310 proteins shows that sampling from Chroma results in proteins that are highly expressed, fold and have favourable biophysical properties. The crystal structures of two designed proteins exhibit atomistic agreement with Chroma samples (a backbone root-mean-square deviation of around 1.0 Å). With this unified approach to protein design, we hope to accelerate the programming of protein matter to benefit human health, materials science and synthetic biology.


Subject(s)
Algorithms , Computer Simulation , Protein Conformation , Proteins , Humans , Bayes Theorem , Directed Molecular Evolution , Machine Learning , Models, Molecular , Protein Folding , Proteins/chemistry , Proteins/metabolism , Semantics , Synthetic Biology/methods , Synthetic Biology/trends
2.
Cell Syst ; 10(6): 526-534.e3, 2020 06 24.
Article in English | MEDLINE | ID: mdl-32553183

ABSTRACT

Gene regulation networks allow organisms to adapt to diverse environmental niches. However, the constraints underlying the evolution of gene regulation remain ill defined. Here, we show that partial order, a concept that ranks network output levels as a function of different input signals, identifies such constraints. We tested our predictions by experimentally evolving an engineered signal-integrating network in multiple environments. We find that populations: (1) expand in fitness space along the Pareto-optimal front associated with conflicts in regulatory demands, by fine-tuning binding affinities within the network, and (2) expand beyond the Pareto-optimal front through changes in the network structure. Our constraint predictions are based only on partial order and do not require information on the network architecture or underlying genetics. Overall, our findings show that limited knowledge of current regulatory phenotypes can provide predictions on future evolutionary constraints.


Subject(s)
Gene Regulatory Networks/genetics , Evolution, Molecular , Humans
3.
Annu Rev Biophys ; 49: 181-197, 2020 05 06.
Article in English | MEDLINE | ID: mdl-32040932

ABSTRACT

The limits of evolution have long fascinated biologists. However, the causes of evolutionary constraint have remained elusive due to a poor mechanistic understanding of studied phenotypes. Recently, a range of innovative approaches have leveraged mechanistic information on regulatory networks and cellular biology. These methods combine systems biology models with population and single-cell quantification and with new genetic tools, and they have been applied to a range of complex cellular functions and engineered networks. In this article, we review these developments, which are revealing the mechanistic causes of epistasis at different levels of biological organization (in molecular recognition, within a single regulatory network, and between different networks), providing first indications of predictable features of evolutionary constraint.


Subject(s)
Evolution, Molecular , Systems Biology/methods , Epistasis, Genetic , Gene Regulatory Networks , Phenotype
4.
Cell Syst ; 10(1): 15-24.e5, 2020 01 22.
Article in English | MEDLINE | ID: mdl-31838147

ABSTRACT

Natural evolution encodes rich information about the structure and function of biomolecules in the genetic record. Previously, statistical analysis of co-variation patterns in natural protein families has enabled the accurate computation of 3D structures. Here, we explored generating similar information by experimental evolution, starting from a single gene and performing multiple cycles of in vitro mutagenesis and functional selection in Escherichia coli. We evolved two antibiotic resistance proteins, β-lactamase PSE1 and acetyltransferase AAC6, and obtained hundreds of thousands of diverse functional sequences. Using evolutionary coupling analysis, we inferred residue interaction constraints that were in agreement with contacts in known 3D structures, confirming genetic encoding of structural constraints in the selected sequences. Computational protein folding with interaction constraints then yielded 3D structures with the same fold as natural relatives. This work lays the foundation for a new experimental method (3Dseq) for protein structure determination, combining evolution experiments with inference of residue interactions from sequence information. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.


Subject(s)
Evolution, Molecular , Proteins/chemistry , Humans , Protein Conformation
5.
Nat Commun ; 10(1): 4213, 2019 09 16.
Article in English | MEDLINE | ID: mdl-31527666

ABSTRACT

Understanding the pattern of epistasis, the non-independence of mutations, is critical for relating genotype and phenotype. However, the combinatorial complexity of potential epistatic interactions has severely limited the analysis of this problem. Using new mutational approaches, we report a comprehensive experimental study of all 2^13 mutants that link two phenotypically distinct variants of the Entacmaea quadricolor fluorescent protein, an opportunity to examine epistasis up to the 13th order. The data show the existence of many high-order epistatic interactions between mutations, but also reveal extraordinary sparsity, enabling novel experimental and computational strategies for learning the relevant epistasis. We demonstrate that such information, in turn, can be used to accurately predict phenotypes in practical situations where the number of measurements is limited. Finally, we show how the observed epistasis shapes the solution space of single-mutation trajectories between the parental fluorescent proteins, informative about the protein's evolutionary potential. This work provides conceptual and experimental strategies to profoundly characterize epistasis in a protein, relevant to both natural and laboratory evolution.


Subject(s)
Epistasis, Genetic , Proteins/genetics , Evolution, Molecular , Genotype , Mutation , Phenotype , Proteins/chemistry
6.
Nat Genet ; 51(7): 1170-1176, 2019 07.
Article in English | MEDLINE | ID: mdl-31209393

ABSTRACT

We describe an experimental method of three-dimensional (3D) structure determination that exploits the increasing ease of high-throughput mutational scans. Inspired by the success of using natural, evolutionary sequence covariation to compute protein and RNA folds, we explored whether 'laboratory', synthetic sequence variation might also yield 3D structures. We analyzed five large-scale mutational scans and discovered that the pairs of residues with the largest positive epistasis in the experiments are sufficient to determine the 3D fold. We show that the strongest epistatic pairings from genetic screens of three proteins, a ribozyme and a protein interaction reveal 3D contacts within and between macromolecules. Using these experimental epistatic pairs, we compute ab initio folds for a GB1 domain (within 1.8 Å of the crystal structure) and a WW domain (2.1 Å). We propose strategies that reduce the number of mutants needed for contact prediction, suggesting that genomics-based techniques can efficiently predict 3D structure.


Subject(s)
Adaptor Proteins, Signal Transducing/chemistry , Bacterial Proteins/chemistry , Epistasis, Genetic , Mutation , Poly(A)-Binding Proteins/chemistry , Protein Conformation , RNA, Catalytic/chemistry , Saccharomyces cerevisiae Proteins/chemistry , Transcription Factors/chemistry , Adaptor Proteins, Signal Transducing/genetics , Bacterial Proteins/genetics , Humans , Poly(A)-Binding Proteins/genetics , Protein Domains , Protein Folding , RNA, Catalytic/genetics , Saccharomyces cerevisiae Proteins/genetics , Transcription Factors/genetics , YAP-Signaling Proteins
7.
Methods Mol Biol ; 1851: 123-134, 2019.
Article in English | MEDLINE | ID: mdl-30298395

ABSTRACT

Defining the extent of epistasis, the nonindependence of the effects of mutations, is essential for understanding the relationship of genotype, phenotype, and fitness in biological systems. The applications cover many areas of biological research, including biochemistry, genomics, protein and systems engineering, medicine, and evolutionary biology. However, the quantitative definitions of epistasis vary among fields, and the analysis beyond just pairwise effects can be problematic. Here, we demonstrate the application of a particular mathematical formalism, the weighted Walsh-Hadamard transform, which unifies a number of different definitions of epistasis. We provide a computational implementation of such analysis using a computer-generated higher-order mutational dataset. We discuss general considerations regarding the null hypothesis for independent mutational effects, which then allows a quantitative identification of epistasis in an experimental dataset.


Subject(s)
Mutation/genetics , Proteins/genetics , Biological Evolution , Computational Biology , Epistasis, Genetic/genetics , Evolution, Molecular , Genotype , Models, Genetic , Selection, Genetic/genetics
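
As an illustration of the kind of analysis the abstract in entry 7 describes, the following is a minimal sketch of a plain (uniformly normalized) Walsh-Hadamard transform applied to a complete set of phenotypes over binary genotypes. It is not the chapter's implementation: the function names `hadamard` and `walsh_coefficients`, the genotype ordering (binary counting, e.g. 00, 01, 10, 11) and the 1/2^n normalization are assumptions made here, and the specific weighting used in the chapter is not reproduced.

```python
import numpy as np


def hadamard(n_sites):
    """Build the 2^n x 2^n Hadamard matrix by the Sylvester construction."""
    base = np.array([[1.0, 1.0], [1.0, -1.0]])
    H = np.array([[1.0]])
    for _ in range(n_sites):
        H = np.kron(H, base)
    return H


def walsh_coefficients(phenotypes):
    """Return uniformly normalized Walsh-Hadamard coefficients.

    `phenotypes` must list one value per genotype, ordered by binary
    counting over the sites (an assumption of this sketch).
    """
    y = np.asarray(phenotypes, dtype=float)
    n_sites = y.size.bit_length() - 1
    if y.size == 0 or 2 ** n_sites != y.size:
        raise ValueError("need exactly 2^n phenotype values")
    return hadamard(n_sites) @ y / y.size


# Two sites, genotypes 00, 01, 10, 11.
additive = walsh_coefficients([0.0, 1.0, 2.0, 3.0])   # effects simply add up
epistatic = walsh_coefficients([0.0, 1.0, 2.0, 5.0])  # double mutant deviates
```

In this ordering the last coefficient is the pairwise interaction term: it vanishes when the two mutational effects are independent and is nonzero otherwise.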
8.
Nat Biotechnol ; 35(2): 128-135, 2017 02.
Article in English | MEDLINE | ID: mdl-28092658

ABSTRACT

Many high-throughput experimental technologies have been developed to assess the effects of large numbers of mutations (variation) on phenotypes. However, designing functional assays for these methods is challenging, and systematic testing of all combinations is impossible, so robust methods to predict the effects of genetic variation are needed. Most prediction methods exploit evolutionary sequence conservation but do not consider the interdependencies of residues or bases. We present EVmutation, an unsupervised statistical method for predicting the effects of mutations that explicitly captures residue dependencies between positions. We validate EVmutation by comparing its predictions with outcomes of high-throughput mutagenesis experiments and measurements of human disease mutations and show that it outperforms methods that do not account for epistasis. EVmutation can be used to assess the quantitative effects of mutations in genes of any organism. We provide pre-computed predictions for ∼7,000 human proteins at http://evmutation.org/.


Subject(s)
Conserved Sequence/genetics , DNA Mutational Analysis/methods , Epistasis, Genetic/genetics , Genetic Variation/genetics , High-Throughput Nucleotide Sequencing/methods , Proteome/genetics , Amino Acid Sequence/genetics , Evolution, Molecular , Humans , Molecular Sequence Data , Mutation/genetics , Proteome/chemistry
10.
PLoS Genet ; 9(6): e1003580, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23825963

ABSTRACT

The epistatic interactions that underlie evolutionary constraint have mainly been studied for constant external conditions. However, environmental changes may modulate epistasis and hence affect genetic constraints. Here we investigate genetic constraints in the adaptive evolution of a novel regulatory function in variable environments, using the lac repressor, LacI, as a model system. We have systematically reconstructed mutational trajectories from wild type LacI to three different variants that each exhibit an inverse response to the inducing ligand IPTG, and analyzed the higher-order interactions between genetic and environmental changes. We find epistasis to depend strongly on the environment. As a result, mutational steps essential to inversion but inaccessible by positive selection in one environment, become accessible in another. We present a graphical method to analyze the observed complex higher-order interactions between multiple mutations and environmental change, and show how the interactions can be explained by a combination of mutational effects on allostery and thermodynamic stability. This dependency of genetic constraint on the environment should fundamentally affect evolutionary dynamics and affects the interpretation of phylogenetic data.


Subject(s)
Epistasis, Genetic , Escherichia coli K12/genetics , Escherichia coli Proteins/genetics , Evolution, Molecular , Lac Repressors/genetics , Escherichia coli K12/growth & development , Gene-Environment Interaction , Models, Genetic , Mutation , Phylogeny , Thermodynamics
11.
Curr Opin Biotechnol ; 24(4): 797-802, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23684729

ABSTRACT

Whether organisms evolve to perform tasks optimally has intrigued biologists since Lamarck and Darwin. Optimality models have been used to study diverse properties such as shape, locomotion, and behavior. However, without access to the genetic underpinnings or the ability to manipulate biological functions, it has been difficult to understand an organism's intrinsic potential and limitations. Now, novel experiments are overcoming these technical obstacles and have begun to test optimality in more quantitative terms. With the use of simple model systems, genetic engineering, and mathematical modeling, one can independently quantify the prevailing selective pressures and optimal phenotypes. These studies have given an exciting view into the evolutionary potential and constraints of biological systems, and hold the promise to further test the limits of predicting future evolutionary change.


Subject(s)
Biological Evolution , Models, Genetic , Escherichia coli/genetics , Escherichia coli/physiology , Phenotype , Selection, Genetic , Synthetic Biology
12.
Nature ; 491(7422): 138-42, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-23041932

ABSTRACT

Statistical analysis of protein evolution suggests a design for natural proteins in which sparse networks of coevolving amino acids (termed sectors) comprise the essence of three-dimensional structure and function. However, proteins are also subject to pressures deriving from the dynamics of the evolutionary process itself: the ability to tolerate mutation and to be adaptive to changing selection pressures. To understand the relationship of the sector architecture to these properties, we developed a high-throughput quantitative method for a comprehensive single-mutation study in which every position is substituted individually to every other amino acid. Using a PDZ domain (PSD95(pdz3)) model system, we show that sector positions are functionally sensitive to mutation, whereas non-sector positions are more tolerant to substitution. In addition, we find that adaptation to a new binding specificity initiates exclusively through variation within sector residues. A combination of just two sector mutations located near and away from the ligand-binding site suffices to switch the binding specificity of PSD95(pdz3) quantitatively towards a class-switching ligand. The localization of functional constraint and adaptive variation within the sector has important implications for understanding and engineering proteins.


Subject(s)
Adaptation, Physiological , Amino Acid Substitution , Mutant Proteins/chemistry , PDZ Domains/genetics , PDZ Domains/physiology , Proteins/chemistry , Proteins/metabolism , Adaptation, Physiological/genetics , Adaptation, Physiological/physiology , Amino Acid Sequence , Binding Sites/genetics , Evolution, Molecular , Ligands , Models, Molecular , Molecular Sequence Data , Mutant Proteins/genetics , Mutant Proteins/metabolism , Mutation , Proteins/genetics
13.
Cell ; 146(3): 462-70, 2011 Aug 05.
Article in English | MEDLINE | ID: mdl-21802129

ABSTRACT

Cellular regulation is believed to evolve in response to environmental variability. However, this has been difficult to test directly. Here, we show that a gene regulation system evolves to the optimal regulatory response when challenged with variable environments. We engineered a genetic module subject to regulation by the lac repressor (LacI) in E. coli, whose expression is beneficial in one environmental condition and detrimental in another. Measured tradeoffs in fitness between environments predict the competition between regulatory phenotypes. We show that regulatory evolution in adverse environments is delayed at specific boundaries in the phenotype space of the regulatory LacI protein. Once this constraint is relieved by mutation, adaptation proceeds toward the optimum, yielding LacI with an altered allosteric mechanism that enables an opposite response to its regulatory ligand IPTG. Our results indicate that regulatory evolution can be understood in terms of tradeoff optimization theory.


Subject(s)
Biological Evolution , Escherichia coli/genetics , Gene Expression Regulation, Bacterial , Allosteric Regulation , Escherichia coli Proteins/metabolism , Genetic Fitness , Isopropyl Thiogalactoside/metabolism , Lac Operon , Lac Repressors/metabolism , Mutation
14.
BMC Syst Biol ; 5: 128, 2011 Aug 16.
Article in English | MEDLINE | ID: mdl-21846366

ABSTRACT

BACKGROUND: How transcriptionally regulated gene expression evolves under natural selection is an open question. The cost and benefit of gene expression are the driving factors. While the former can be determined by gratuitous induction, the latter is difficult to measure directly. RESULTS: We addressed this problem by decoupling the regulatory and metabolic function of the Escherichia coli lac system, using an inducer that cannot be metabolized and a carbon source that does not induce. Growth rate measurements directly identified the induced expression level that maximizes the metabolism benefits minus the protein production costs, without relying on models. Using these results, we established a controlled mismatch between sensing and metabolism, resulting in sub-optimal transcriptional regulation with the potential to improve by evolution. Next, we tested the evolutionary response by serial transfer. Constant environments showed cells evolving to the predicted expression optimum. Phenotypes with decreased expression emerged several hundred generations later than phenotypes with increased expression, indicating a higher genetic accessibility of the latter. Environments alternating between low and high expression demands resulted in overall rather than differential changes in expression, which is explained by the concave shape of the cross-environmental tradeoff curve that limits the selective advantage of altering the regulatory response. CONCLUSIONS: This work indicates that the decoupling of regulatory and metabolic functions allows one to directly measure the costs and benefits that underlie the natural selection of gene regulation. Regulated gene expression is shown to evolve within several hundreds of generations to optima that are predicted by these costs and benefits. The results provide a step towards a quantitative understanding of the adaptive origins of regulatory systems.


Subject(s)
Biological Evolution , Environment , Gene Expression Regulation, Fungal/physiology , Models, Biological , Regulatory Elements, Transcriptional/physiology , Selection, Genetic , Computer Simulation , Escherichia coli , Isopropyl Thiogalactoside , Lac Operon/genetics , Polyglutamic Acid/analogs & derivatives , Polylysine/analogs & derivatives
15.
Genes Dev ; 25(16): 1674-9, 2011 Aug 15.
Article in English | MEDLINE | ID: mdl-21852532

ABSTRACT

We have determined the cistrome and transcriptome for the nuclear receptor liver receptor homolog-1 (LRH-1) in exocrine pancreas. Chromatin immunoprecipitation (ChIP)-seq and RNA-seq analyses reveal that LRH-1 directly induces expression of genes encoding digestive enzymes and secretory and mitochondrial proteins. LRH-1 cooperates with the pancreas transcription factor 1-L complex (PTF1-L) in regulating exocrine pancreas-specific gene expression. Elimination of LRH-1 in adult mice reduced the concentration of several lipases and proteases in pancreatic fluid and impaired pancreatic fluid secretion in response to cholecystokinin. Thus, LRH-1 is a key regulator of the exocrine pancreas-specific transcriptional network required for the production and secretion of pancreatic fluid.


Subject(s)
Gene Regulatory Networks , Pancreas, Exocrine/metabolism , Receptors, Cytoplasmic and Nuclear/genetics , Transcription Factors/genetics , Animals , Antineoplastic Agents, Hormonal/pharmacology , Base Sequence , Blotting, Western , Chromatin Immunoprecipitation , Down-Regulation/drug effects , Female , Gene Expression Profiling , Humans , Lipase/genetics , Lipase/metabolism , Male , Mice , Mice, Knockout , Mice, Transgenic , Molecular Sequence Data , Pancreas, Exocrine/drug effects , Peptide Hydrolases/genetics , Peptide Hydrolases/metabolism , Receptors, Cytoplasmic and Nuclear/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Sequence Homology, Amino Acid , Tamoxifen/pharmacology , Transcription Factors/metabolism
16.
J Theor Biol ; 272(1): 141-4, 2011 Mar 07.
Article in English | MEDLINE | ID: mdl-21167837

ABSTRACT

Having multiple peaks within fitness landscapes critically affects the course of evolution, but whether their presence imposes specific requirements at the level of genetic interactions remains unestablished. Here we show that to exhibit multiple fitness peaks, a biological system must contain reciprocal sign epistatic interactions, which are defined as genetic changes that are separately unfavorable but jointly advantageous. Using Morse theory, we argue that it is impossible to formulate a sufficient condition for multiple peaks in terms of local genetic interactions. These findings indicate that systems incapable of reciprocal sign epistasis will always possess a single fitness peak. However, reciprocal sign epistasis should be pervasive in nature as it is a logical consequence of specificity in molecular interactions. The results thus predict that specific molecular interactions may yield multiple fitness peaks, which can be tested experimentally.


Subject(s)
Epistasis, Genetic , Genetic Fitness , Alleles , Models, Genetic
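
The definition of reciprocal sign epistasis given in the abstract of entry 16 can be made concrete with a small check on a two-locus fitness landscape. This is only a sketch: the function name `is_reciprocal_sign_epistasis` and the argument names are hypothetical, and the test used here is the commonly stated form (the sign of each mutation's fitness effect flips depending on whether the other mutation is present), which includes the abstract's "separately unfavorable but jointly advantageous" case.

```python
def is_reciprocal_sign_epistasis(f_ab, f_Ab, f_aB, f_AB):
    """Check a two-locus, two-allele landscape for reciprocal sign epistasis.

    Lowercase letters denote the original allele and uppercase the mutated
    one, so f_Ab is the fitness with only the first site mutated.
    """
    # Effect of mutating the first site, in each background of the second.
    effect_A_without_B = f_Ab - f_ab
    effect_A_with_B = f_AB - f_aB
    # Effect of mutating the second site, in each background of the first.
    effect_B_without_A = f_aB - f_ab
    effect_B_with_A = f_AB - f_Ab
    # Both effects must change sign between backgrounds.
    return (effect_A_without_B * effect_A_with_B < 0
            and effect_B_without_A * effect_B_with_A < 0)


# Each single mutant is less fit than the start, the double mutant is fitter.
two_peaks = is_reciprocal_sign_epistasis(1.0, 0.8, 0.7, 1.2)
# Every step uphill: a single smooth peak at the double mutant.
one_peak = is_reciprocal_sign_epistasis(1.0, 1.1, 1.2, 1.3)
```

In the first example both the starting genotype and the double mutant are local fitness maxima, which is the situation the abstract links to multi-peaked landscapes.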
17.
Nature ; 445(7126): 383-6, 2007 Jan 25.
Article in English | MEDLINE | ID: mdl-17251971

ABSTRACT

When attempting to understand evolution, we traditionally rely on analysing evolutionary outcomes, despite the fact that unseen intermediates determine its course. A handful of recent studies has begun to explore these intermediate evolutionary forms, which can be reconstructed in the laboratory. With this first view on empirical evolutionary landscapes, we can now finally start asking why particular evolutionary paths are taken.


Subject(s)
Biological Evolution , Selection, Genetic , Animals , Binding Sites , Epistasis, Genetic , Models, Molecular , Mutagenesis
18.
PLoS Comput Biol ; 2(5): e58, 2006 May.
Article in English | MEDLINE | ID: mdl-16733549

ABSTRACT

Ample evidence has accumulated for the evolutionary importance of duplication events. However, little is known about the ensuing step-by-step divergence process and the selective conditions that allow it to progress. Here we present a computational study on the divergence of two repressors after duplication. A central feature of our approach is that intermediate phenotypes can be quantified through the use of in vivo measured repression strengths of Escherichia coli lac mutants. Evolutionary pathways are constructed by multiple rounds of single base pair substitutions and selection for tight and independent binding. Our analysis indicates that when a duplicated repressor co-diverges together with its binding site, the fitness landscape allows funneling to a new regulatory interaction with early increases in fitness. We find that neutral mutations do not play an essential role, which is important for substantial divergence probabilities. By varying the selective pressure we can pinpoint the necessary ingredients for the observed divergence. Our findings underscore the importance of coevolutionary mechanisms in regulatory networks, and should be relevant for the evolution of protein-DNA as well as protein-protein interactions.


Subject(s)
Computational Biology/methods , Evolution, Molecular , Genetic Techniques , Mutation , DNA/genetics , Escherichia coli/genetics , Models, Genetic , Operator Regions, Genetic , Phenotype , Protein Binding , Repressor Proteins/genetics
19.
Phys Rev E Stat Nonlin Soft Matter Phys ; 65(4 Pt 2B): 046603, 2002 Apr.
Article in English | MEDLINE | ID: mdl-12006043

ABSTRACT

We present measurements of speckle in a random laser. We analyze its first-order statistics and show that, contrary to what might be expected for passive systems, analyses of the intensity distribution P(I) and the speckle spot size do provide information about light transport inside the system. P(I) is used to determine the degree to which an incident probe is amplified by the random laser. The shrinking speckle spot size reflects the change in path length distribution P(Lambda); we deduce that the average path length in the studied random laser is two times longer above threshold than in a passive diffusive system.
