Results 1 - 20 of 83
1.
Nature ; 533(7603): 397-401, 2016 05 19.
Article in English | MEDLINE | ID: mdl-27193686

ABSTRACT

Fitness landscapes depict how genotypes manifest at the phenotypic level and form the basis of our understanding of many areas of biology, yet their properties remain elusive. Previous studies have analysed specific genes, often using their function as a proxy for fitness, experimentally assessing the effect on function of single mutations and their combinations in a specific sequence or in different sequences. However, systematic high-throughput studies of the local fitness landscape of an entire protein have not yet been reported. Here we visualize an extensive region of the local fitness landscape of the green fluorescent protein from Aequorea victoria (avGFP) by measuring the native function (fluorescence) of tens of thousands of derivative genotypes of avGFP. We show that the fitness landscape of avGFP is narrow, with 3/4 of the derivatives with a single mutation showing reduced fluorescence and half of the derivatives with four mutations being completely non-fluorescent. The narrowness is enhanced by epistasis, which was detected in up to 30% of genotypes with multiple mutations and mostly occurred through the cumulative effect of slightly deleterious mutations causing a threshold-like decrease in protein stability and a concomitant loss of fluorescence. A model of orthologous sequence divergence spanning hundreds of millions of years predicted the extent of epistasis in our data, indicating congruence between the fitness landscape properties at the local and global scales. The characterization of the local fitness landscape of avGFP has important implications for several fields including molecular evolution, population genetics and protein design.


Subject(s)
Genetic Fitness , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , Animals , Epistasis, Genetic , Evolution, Molecular , Fluorescence , Genetic Association Studies , Genotype , Hydrozoa/chemistry , Hydrozoa/genetics , Mutant Proteins/genetics , Mutant Proteins/metabolism , Mutation/genetics , Phenotype
2.
Bull Math Biol ; 84(8): 74, 2022 06 17.
Article in English | MEDLINE | ID: mdl-35713756

ABSTRACT

Empirical assays of fitness landscapes suggest that they may be rugged, that is, having multiple fitness peaks. Such fitness landscapes, those that have multiple peaks, necessarily have special local structures, called reciprocal sign epistasis (Poelwijk et al. in J Theor Biol 272:141-144, 2011). Here, we investigate the quantitative relationship between the number of fitness peaks and the number of reciprocal sign epistatic interactions. Previously, it has been shown (Poelwijk et al. in J Theor Biol 272:141-144, 2011) that pairwise reciprocal sign epistasis is a necessary but not sufficient condition for the existence of multiple peaks. Applying discrete Morse theory, which to our knowledge has never been used in this context, we extend this result by giving the minimal number of reciprocal sign epistatic interactions required to create a given number of peaks.
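
Illustrative note (not part of the abstract): the two quantities related above, fitness peaks and reciprocal sign epistatic interactions, can be counted on a small biallelic landscape. The sketch below is a minimal brute-force illustration under stated assumptions: genotypes are tuples of 0/1, every genotype has a measured fitness, and the function names and fitness values are hypothetical. It does not implement the discrete Morse theory argument of the paper.

```python
from itertools import combinations, product

def peaks(fitness):
    """Genotypes fitter than every one of their single-mutant neighbours."""
    found = []
    for g, w in fitness.items():
        neighbours = [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(len(g))]
        if all(w > fitness[n] for n in neighbours):
            found.append(g)
    return found

def with_ones(g, *loci):
    """Copy of genotype g with the given loci set to 1."""
    h = list(g)
    for k in loci:
        h[k] = 1
    return tuple(h)

def reciprocal_sign_faces(fitness, n):
    """Two-locus faces on which each mutation changes the sign of the other's effect."""
    faces = []
    for i, j in combinations(range(n), 2):
        for g in product((0, 1), repeat=n):
            if g[i] or g[j]:
                continue  # visit each face once, starting from its 00 corner
            w00, w10 = fitness[g], fitness[with_ones(g, i)]
            w01, w11 = fitness[with_ones(g, j)], fitness[with_ones(g, i, j)]
            sign_change_i = (w10 - w00) * (w11 - w01) < 0
            sign_change_j = (w01 - w00) * (w11 - w10) < 0
            if sign_change_i and sign_change_j:
                faces.append((g, i, j))
    return faces

# Hypothetical two-locus landscape: both single mutants are less fit than
# the wild type and the double mutant.
landscape = {(0, 0): 1.0, (1, 0): 0.5, (0, 1): 0.6, (1, 1): 1.2}
print(peaks(landscape))                     # two peaks: (0, 0) and (1, 1)
print(reciprocal_sign_faces(landscape, 2))  # one face: ((0, 0), 0, 1)
```

In this toy case the single face with reciprocal sign epistasis accompanies two peaks, consistent with the necessary-condition result cited in the abstract.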


Subject(s)
Epistasis, Genetic , Models, Genetic , Genetic Fitness , Mathematical Concepts , Models, Biological , Mutation
3.
PLoS Genet ; 15(4): e1008079, 2019 04.
Article in English | MEDLINE | ID: mdl-30969963

ABSTRACT

Characterizing the fitness landscape, a representation of fitness for a large set of genotypes, is key to understanding how genetic information is interpreted to create functional organisms. Here we determined the evolutionarily-relevant segment of the fitness landscape of His3, a gene coding for an enzyme in the histidine synthesis pathway, focusing on combinations of amino acid states found at orthologous sites of extant species. Just 15% of amino acids found in yeast His3 orthologues were always neutral while the impact on fitness of the remaining 85% depended on the genetic background. Furthermore, at 67% of sites, amino acid replacements were under sign epistasis, having both strongly positive and negative effects in different genetic backgrounds. 46% of sites were under reciprocal sign epistasis. The fitness impact of amino acid replacements was influenced by only a few genetic backgrounds but involved interaction of multiple sites, shaping a rugged fitness landscape in which many of the shortest paths between highly fit genotypes are inaccessible.


Subject(s)
Evolution, Molecular , Fungal Proteins/genetics , Fungal Proteins/metabolism , Genetic Fitness , Yeasts/genetics , Yeasts/metabolism , Amino Acid Sequence , Amino Acid Substitution , Amino Acids/genetics , Amino Acids/metabolism , Epistasis, Genetic , Fungal Proteins/chemistry , Genes, Fungal , Genotype , Hydro-Lyases/chemistry , Hydro-Lyases/genetics , Hydro-Lyases/metabolism , Models, Genetic , Models, Molecular , Phylogeny , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
4.
Proc Natl Acad Sci U S A ; 115(50): 12728-12732, 2018 12 11.
Article in English | MEDLINE | ID: mdl-30478037

ABSTRACT

Bioluminescence is found across the entire tree of life, conferring a spectacular set of visually oriented functions from attracting mates to scaring off predators. Half a dozen different luciferins, molecules that emit light when enzymatically oxidized, are known. However, just one biochemical pathway for luciferin biosynthesis has been described in full, which is found only in bacteria. Here, we report identification of the fungal luciferase and three other key enzymes that together form the biosynthetic cycle of the fungal luciferin from caffeic acid, a simple and widespread metabolite. Introduction of the identified genes into the genome of the yeast Pichia pastoris along with caffeic acid biosynthesis genes resulted in a strain that is autoluminescent in standard media. We analyzed evolution of the enzymes of the luciferin biosynthesis cycle and found that fungal bioluminescence emerged through a series of events that included two independent gene duplications. The retention of the duplicated enzymes of the luciferin pathway in nonluminescent fungi shows that the gene duplication was followed by functional sequence divergence of enzymes of at least one gene in the biosynthetic pathway and suggests that the evolution of fungal bioluminescence proceeded through several closely related stepping stone nonluminescent biochemical reactions with adaptive roles. The availability of a complete eukaryotic luciferin biosynthesis pathway provides several applications in biomedicine and bioengineering.


Subject(s)
Fungi/genetics , Luminescent Proteins/genetics , Amino Acid Sequence , Animals , Biosynthetic Pathways/genetics , Caffeic Acids , Cell Line , Cell Line, Tumor , Female , Gene Duplication/genetics , HEK293 Cells , HeLa Cells , Humans , Mice , Mice, Inbred BALB C , Sequence Alignment , Xenopus laevis
6.
Bioinformatics ; 2019 Nov 19.
Article in English | MEDLINE | ID: mdl-31742320

ABSTRACT

MOTIVATION: Epistasis, the context-dependence of the contribution of an amino acid substitution to fitness, is common in evolution. To detect epistasis, fitness must be measured for at least four genotypes: the reference genotype, two different single mutants and a double mutant with both of the single mutations. For higher-order epistasis of the order n, fitness has to be measured for all 2^n genotypes of an n-dimensional hypercube in genotype space forming a "combinatorially complete dataset". So far, only a handful of such datasets have been produced by manual curation. Concurrently, random mutagenesis experiments have produced measurements of fitness and other phenotypes in a high-throughput manner, potentially containing a number of combinatorially complete datasets. RESULTS: We present an effective recursive algorithm for finding all hypercube structures in random mutagenesis experimental data. To test the algorithm, we applied it to the data from a recent HIS3 protein dataset and found all 199,847,053 unique combinatorially complete genotype combinations of dimensionality ranging from two to twelve. The algorithm may be useful for researchers looking for higher-order epistasis in their high-throughput experimental data. AVAILABILITY: https://github.com/ivankovlab/HypercubeME.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
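
Illustrative note (not part of the abstract): the four-genotype requirement and the notion of a combinatorially complete dataset can be made concrete with a short sketch. This is a brute-force check under simplifying assumptions, not the recursive algorithm of HypercubeME: genotypes are represented as frozensets of mutations relative to one reference, fitness is assumed to be on an additive scale, and the mutation labels and fitness values are hypothetical.

```python
from itertools import combinations

def is_complete(measured, base, mutations):
    """True if all 2**n genotypes formed by adding subsets of `mutations` to `base` were measured."""
    muts = list(mutations)
    return all(
        base | frozenset(subset) in measured
        for r in range(len(muts) + 1)
        for subset in combinations(muts, r)
    )

def pairwise_epistasis(fitness, base, a, b):
    """Deviation of the double mutant from additivity of the two single mutants."""
    return (fitness[base | {a, b}] - fitness[base | {a}]
            - fitness[base | {b}] + fitness[base])

# Hypothetical measurements: reference, two single mutants, one double mutant.
reference = frozenset()
fitness = {
    reference: 1.0,
    frozenset({"A10G"}): 0.9,
    frozenset({"L25P"}): 0.8,
    frozenset({"A10G", "L25P"}): 0.3,
}

print(is_complete(fitness, reference, {"A10G", "L25P"}))         # all four corners present
print(is_complete(fitness, reference, {"A10G", "L25P", "V3I"}))  # the 3-cube is incomplete
print(pairwise_epistasis(fitness, reference, "A10G", "L25P"))    # negative: worse than additive
```

The brute-force check costs 2^n lookups per candidate hypercube, which is why a dedicated algorithm is needed to enumerate all hypercubes in a large mutagenesis dataset.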

7.
Nature ; 510(7503): 109-14, 2014 Jun 05.
Article in English | MEDLINE | ID: mdl-24847885

ABSTRACT

The origins of neural systems remain unresolved. In contrast to other basal metazoans, ctenophores (comb jellies) have both complex nervous and mesoderm-derived muscular systems. These holoplanktonic predators also have sophisticated ciliated locomotion, behaviour and distinct development. Here we present the draft genome of Pleurobrachia bachei, Pacific sea gooseberry, together with ten other ctenophore transcriptomes, and show that they are remarkably distinct from other animal genomes in their content of neurogenic, immune and developmental genes. Our integrative analyses place Ctenophora as the earliest lineage within Metazoa. This hypothesis is supported by comparative analysis of multiple gene families, including the apparent absence of HOX genes, canonical microRNA machinery, and reduced immune complement in ctenophores. Although two distinct nervous systems are well recognized in ctenophores, many bilaterian neuron-specific genes and genes of 'classical' neurotransmitter pathways either are absent or, if present, are not expressed in neurons. Our metabolomic and physiological data are consistent with the hypothesis that ctenophore neural systems, and possibly muscle specification, evolved independently from those in other animals.


Subject(s)
Ctenophora/genetics , Evolution, Molecular , Genome/genetics , Nervous System , Animals , Ctenophora/classification , Ctenophora/immunology , Ctenophora/physiology , Genes, Developmental , Genes, Homeobox , Mesoderm/metabolism , Metabolomics , MicroRNAs , Molecular Sequence Data , Muscles/physiology , Nervous System/metabolism , Neurons/metabolism , Neurotransmitter Agents , Phylogeny , Transcriptome/genetics
8.
Bioinformatics ; 34(21): 3653-3658, 2018 11 01.
Article in English | MEDLINE | ID: mdl-29722803

ABSTRACT

Motivation: Computational prediction of the effect of mutations on protein stability is used by researchers in many fields. The utility of the prediction methods is affected by their accuracy and bias. Bias, a systematic shift of the predicted change of stability, has been noted as an issue for several methods, but has not been investigated systematically. Presence of the bias may lead to misleading results, especially when exploring the effects of combinations of different mutations. Results: Here we use a protocol to measure the bias as a function of the number of introduced mutations. It is based on a self-consistency test of the reciprocity of the effect of a mutation. An advantage of the approach is that it relies solely on crystal structures without experimentally measured stability values. We applied the protocol to four popular algorithms predicting change of protein stability upon mutation, FoldX, Eris, Rosetta and I-Mutant, and found an inherent bias. For one program, FoldX, we managed to substantially reduce the bias using additional relaxation by Modeller. Authors using algorithms for predicting effects of mutations should be aware of the bias described here. Availability and implementation: All calculations were implemented by in-house PERL scripts. Supplementary information: Supplementary data are available at Bioinformatics online. Note: The article 10.1093/bioinformatics/bty348, published alongside this paper, also addresses the problem of biases in protein stability change predictions.
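
Illustrative note (not part of the abstract): a reciprocity test rests on the thermodynamic identity that the stability change of a mutation and of its reversal cancel, so an unbiased predictor should return values that sum to zero for each forward/reverse pair. The sketch below shows one way such a bias could be summarized. The function name, the averaging formula and the numbers are assumptions for illustration, not the authors' PERL protocol.

```python
from statistics import mean

def reciprocity_bias(pairs):
    """Mean of (forward + reverse) / 2 over paired predicted stability changes.

    Each pair holds the predicted change for a mutation A->B and for its
    reversal B->A. A perfectly self-consistent predictor gives 0.
    """
    return mean((forward + reverse) / 2 for forward, reverse in pairs)

# Hypothetical predictions in kcal/mol for two forward/reverse pairs.
consistent = [(1.0, -1.0), (2.0, -2.0)]
shifted = [(1.5, -0.5), (2.0, 0.0)]

print(reciprocity_bias(consistent))  # no systematic shift
print(reciprocity_bias(shifted))     # positive shift in both directions
```

Because only predicted values enter the calculation, a test of this kind needs structures for both states of each pair but no experimental stability measurements, as the abstract notes.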


Subject(s)
Proteins/genetics , Software , Algorithms , Bias , Mutation , Protein Stability
9.
Trends Genet ; 31(1): 24-33, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25438718

ABSTRACT

The factors that determine the tempo and mode of protein evolution continue to be a central question in molecular evolution. Traditionally, studies of protein evolution focused on the rates of amino acid substitutions. More recently, with the availability of sequence data and advanced experimental techniques, the focus of attention has shifted toward the study of evolutionary trajectories and the overall layout of protein fitness landscapes. In this review we describe the effect of epistasis on the topology of evolutionary pathways that are likely to be found in fitness landscapes and develop a simple theory to connect the number of maladapted genotypes to the topology of fitness landscapes with epistatic interactions. Finally, we review recent studies that have probed the extent of epistatic interactions and have begun to chart the fitness landscapes in protein sequence space.


Subject(s)
Amino Acid Sequence , Genetic Fitness , Animals , Epistasis, Genetic , Humans
10.
Nature ; 490(7421): 535-8, 2012 Oct 25.
Article in English | MEDLINE | ID: mdl-23064225

ABSTRACT

The main forces directing long-term molecular evolution remain obscure. A sizable fraction of amino-acid substitutions seem to be fixed by positive selection, but it is unclear to what degree long-term protein evolution is constrained by epistasis, that is, instances when substitutions that are accepted in one genotype are deleterious in another. Here we obtain a quantitative estimate of the prevalence of epistasis in long-term protein evolution by relating data on amino-acid usage in 14 organelle proteins and 2 nuclear-encoded proteins to their rates of short-term evolution. We studied multiple alignments of at least 1,000 orthologues for each of these 16 proteins from species from a diverse phylogenetic background and found that an average site contained approximately eight different amino acids. Thus, without epistasis an average site should accept two-fifths of all possible amino acids, and the average rate of amino-acid substitutions should therefore be about three-fifths lower than the rate of neutral evolution. However, we found that the measured rate of amino-acid substitution in recent evolution is 20 times lower than the rate of neutral evolution and an order of magnitude lower than that expected in the absence of epistasis. These data indicate that epistasis is pervasive throughout protein evolution: about 90 per cent of all amino-acid substitutions have a neutral or beneficial impact only in the genetic backgrounds in which they occur, and must therefore be deleterious in a different background of other species. Our findings show that most amino-acid substitutions have different fitness effects in different species and that epistasis provides the primary conceptual framework to describe the tempo and mode of long-term protein evolution.
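
Illustrative note (not part of the abstract): the fractions quoted above follow from simple arithmetic on the stated numbers, assuming 20 possible amino acids per site and expressing rates relative to a neutral rate of 1. The variable names are hypothetical, and the sketch only restates the abstract's figures. It does not reproduce the paper's estimation procedure.

```python
from fractions import Fraction

AMINO_ACIDS = 20         # possible amino acids at a site (assumption of the sketch)
observed_per_site = 8    # "approximately eight different amino acids"

# Without epistasis, a site accepts the states observed across orthologues.
acceptable_fraction = Fraction(observed_per_site, AMINO_ACIDS)  # two-fifths
expected_rate = acceptable_fraction          # relative to the neutral rate
expected_reduction = 1 - expected_rate       # three-fifths lower than neutral

measured_rate = Fraction(1, 20)              # "20 times lower" than neutral
shortfall = expected_rate / measured_rate    # expected vs measured rate

print(acceptable_fraction, expected_reduction, shortfall)
```

The eightfold gap between the expected and measured rates corresponds to the "order of magnitude" difference described in the abstract.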


Subject(s)
Epistasis, Genetic/genetics , Evolution, Molecular , Amino Acid Substitution/genetics , Animals , Cell Nucleus/genetics , Computational Biology , Genetic Fitness , Genotype , Models, Genetic , Mutation , Organelles/genetics , Phylogeny , Proteins/chemistry , Proteins/genetics , Sequence Alignment , Species Specificity
12.
Proc Natl Acad Sci U S A ; 112(30): 9328-33, 2015 Jul 28.
Article in English | MEDLINE | ID: mdl-26170332

ABSTRACT

Proteases play important roles in many biologic processes and are key mediators of cancer, inflammation, and thrombosis. However, comprehensive and quantitative techniques to define the substrate specificity profile of proteases are lacking. The metalloprotease ADAMTS13 regulates blood coagulation by cleaving von Willebrand factor (VWF), reducing its procoagulant activity. A mutagenized substrate phage display library based on a 73-amino acid fragment of VWF was constructed, and the ADAMTS13-dependent change in library complexity was evaluated over reaction time points, using high-throughput sequencing. Reaction rate constants (kcat/KM) were calculated for nearly every possible single amino acid substitution within this fragment. This massively parallel enzyme kinetics analysis detailed the specificity of ADAMTS13 and demonstrated the critical importance of the P1-P1' substrate residues while defining exosite binding domains. These data provided empirical evidence for the propensity for epistasis within VWF and showed strong correlation to conservation across orthologs, highlighting evolutionary selective pressures for VWF.


Subject(s)
ADAM Proteins/chemistry , High-Throughput Nucleotide Sequencing/methods , ADAMTS13 Protein , Amino Acid Sequence , Binding Sites/genetics , Blood Coagulation , Cloning, Molecular , Epistasis, Genetic , Humans , Kinetics , Molecular Sequence Data , Mutagenesis , Mutation , Peptide Library , Protein Binding/genetics , Proteolysis , Substrate Specificity , von Willebrand Factor/chemistry
13.
Mol Biol Evol ; 32(2): 542-54, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25415964

ABSTRACT

The nature of factors governing the tempo and mode of protein evolution is a fundamental issue in evolutionary biology. Specifically, whether or not interactions between different sites, or epistasis, are important in directing the course of evolution has become one of the central questions. Several recent reports have scrutinized patterns of long-term protein evolution, claiming them to be compatible only with an epistatic fitness landscape. However, these claims have not yet been substantiated with a formal model of protein evolution. Here, we formulate a simple covarion-like model of protein evolution focusing on the rate at which the fitness impact of amino acids at a site changes with time. We then apply the model to the data on convergent and divergent protein evolution to test whether or not the incorporation of epistatic interactions is necessary to explain the data. We find that convergent evolution cannot be explained without the incorporation of epistasis and that the rate at which an amino acid state switches from being acceptable at a site to being deleterious is faster than the rate of amino acid substitution. Specifically, for proteins that have persisted in modern prokaryotic organisms since the last universal common ancestor, for each amino acid substitution approximately ten amino acid states switch from being accessible to being deleterious, or vice versa. Thus, molecular evolution can only be perceived in the context of rapid turnover of which amino acids are available for evolution.


Subject(s)
Evolution, Molecular , Proteins/genetics , Models, Genetic , Phylogeny , Proteins/classification
14.
Nat Rev Genet ; 11(2): 97-108, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20051986

ABSTRACT

Gene duplications and their subsequent divergence play an important part in the evolution of novel gene functions. Several models for the emergence, maintenance and evolution of gene copies have been proposed. However, a clear consensus on how gene duplications are fixed and maintained in genomes is lacking. Here, we present a comprehensive classification of the models that are relevant to all stages of the evolution of gene duplications. Each model predicts a unique combination of evolutionary dynamics and functional properties. Setting out these predictions is an important step towards identifying the main mechanisms that are involved in the evolution of gene duplications.


Subject(s)
Evolution, Molecular , Gene Duplication , Models, Genetic , Animals , Humans , Phylogeny , Polymorphism, Genetic
16.
Nature ; 465(7300): 922-6, 2010 Jun 17.
Article in English | MEDLINE | ID: mdl-20485343

ABSTRACT

The need to maintain the structural and functional integrity of an evolving protein severely restricts the repertoire of acceptable amino-acid substitutions. However, it is not known whether these restrictions impose a global limit on how far homologous protein sequences can diverge from each other. Here we explore the limits of protein evolution using sequence divergence data. We formulate a computational approach to study the rate of divergence of distant protein sequences and measure this rate for ancient proteins, those that were present in the last universal common ancestor. We show that ancient proteins are still diverging from each other, indicating an ongoing expansion of the protein sequence universe. The slow rate of this divergence is imposed by the sparseness of functional protein sequences in sequence space and the ruggedness of the protein fitness landscape: approximately 98 per cent of sites cannot accept an amino-acid substitution at any given moment but a vast majority of all sites may eventually be permitted to evolve when other, compensatory, changes occur. Thus, approximately 3.5 × 10^9 yr has not been enough to reach the limit of divergent evolution of proteins, and for most proteins the limit of sequence similarity imposed by common function may not exceed that of random sequences.


Subject(s)
Evolution, Molecular , Genetic Variation , Proteins/chemistry , Amino Acid Sequence , Amino Acid Substitution , Amino Acids/chemistry , Molecular Sequence Data , Mutation , Prokaryotic Cells , Protein Structure, Secondary , Selection, Genetic/genetics , Sequence Analysis, Protein , Sequence Homology, Amino Acid
18.
Nature ; 464(7286): 279-82, 2010 Mar 11.
Article in English | MEDLINE | ID: mdl-20182427

ABSTRACT

A long-standing controversy in evolutionary biology is whether or not evolving lineages can cross valleys on the fitness landscape that correspond to low-fitness genotypes, which can eventually enable them to reach isolated fitness peaks. Here we study the fitness landscapes traversed by switches between different AU and GC Watson-Crick nucleotide pairs at complementary sites of mitochondrial transfer RNA stem regions in 83 mammalian species. We find that such Watson-Crick switches occur 30-40 times more slowly than pairs of neutral substitutions, and that alleles corresponding to GU and AC non-Watson-Crick intermediate states segregate within human populations at low frequencies, similar to those of non-synonymous alleles. Substitutions leading to a Watson-Crick switch are strongly correlated, especially in mitochondrial tRNAs encoded on the GT-nucleotide-rich strand of the mitochondrial genome. Using these data we estimate that a typical Watson-Crick switch involves crossing a fitness valley of a depth of about 10^-3 or even about 10^-2, with AC intermediates being slightly more deleterious than GU intermediates. This compensatory evolution must proceed through rare intermediate variants that never reach fixation. The ubiquitous nature of compensatory evolution in mammalian mitochondrial tRNAs and other molecules implies that simultaneous fixation of two alleles that are individually deleterious may be a common phenomenon at the molecular level.


Subject(s)
Evolution, Molecular , Mammals/physiology , RNA, Transfer/genetics , RNA/genetics , Animals , Humans , Mammals/genetics , Mutation/genetics , Polymorphism, Genetic , RNA, Mitochondrial
19.
Mol Biol Evol ; 31(11): 3016-25, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25135947

ABSTRACT

Recombination between double-stranded DNA molecules is a key genetic process which occurs in a wide variety of organisms. Usually, crossing-over (CO) occurs during meiosis between genotypes with 98.0-99.9% sequence identity, because within-population nucleotide diversity only rarely exceeds 2%. However, some species are hypervariable and it is unclear how CO can occur between genotypes with less than 90% sequence identity. Here, we study CO in Schizophyllum commune, a hypervariable cosmopolitan basidiomycete mushroom, a frequently encountered decayer of woody substrates. We crossed two haploid individuals, from the United States and from Russia, and obtained genome sequences for their 17 offspring. The average genetic distance between the parents was 14%, making it possible to study CO at very high resolution. We found reduced levels of linkage disequilibrium between loci flanking the CO sites indicating that they are mostly confined to hotspots of recombination. Furthermore, CO events preferentially occurred in regions under stronger negative selection, in particular within exons that showed reduced levels of nucleotide diversity. Apparently, in hypervariable species CO must avoid regions of higher divergence between the recombining genomes due to limitations imposed by the mismatch repair system, with regions under strong negative selection providing the opportunity for recombination. These patterns are opposite to those observed in a number of less variable species indicating that population genomics of hypervariable species may reveal novel biological phenomena.


Subject(s)
Crossing Over, Genetic , DNA/genetics , Genetic Variation , Schizophyllum/genetics , Base Composition , Base Pairing , Crosses, Genetic , DNA/chemistry , Genetic Loci , Haploidy , Linkage Disequilibrium , Selection, Genetic
20.
Evolution ; 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38990788

ABSTRACT

Vaccination is the most effective tool to control infectious diseases. However, the evolution of vaccine resistance, exemplified by vaccine-resistance in SARS-CoV-2, remains a concern. Here, we model complex vaccination strategies against a pathogen with multiple epitopes - molecules targeted by the vaccine. We found that a vaccine targeting one epitope was ineffective in preventing vaccine escape. Vaccine resistance in highly infectious pathogens was prevented by the full-epitope vaccine, that is, one targeting all available epitopes, but only when the rate of pathogen evolution was low. Strikingly, a bet-hedging strategy of random administration of vaccines targeting different epitopes was the most effective in preventing vaccine resistance in pathogens with low rate of infection and high rate of evolution. Thus, complex vaccination strategies, when biologically feasible, may be preferable to the currently used single-vaccine approaches for long-term control of disease outbreaks, especially when applied to livestock with near 100% vaccination rates.
