Results 1 - 20 of 41
1.
PLoS Biol ; 22(5): e3002594, 2024 May.
Article in English | MEDLINE | ID: mdl-38754362

ABSTRACT

The standard genetic code defines the rules of translation for nearly every life form on Earth. It also determines the amino acid changes accessible via single-nucleotide mutations, thus influencing protein evolvability, that is, the ability of mutation to bring forth adaptive variation in protein function. One of the most striking features of the standard genetic code is its robustness to mutation, yet it remains an open question whether such robustness facilitates or frustrates protein evolvability. To answer this question, we use data from massively parallel sequence-to-function assays to construct and analyze 6 empirical adaptive landscapes under hundreds of thousands of rewired genetic codes, including those of codon compression schemes relevant to protein engineering and synthetic biology. We find that robust genetic codes tend to enhance protein evolvability by rendering smooth adaptive landscapes with few peaks, which are readily accessible from throughout sequence space. However, the standard genetic code is rarely exceptional in this regard, because many alternative codes render smoother landscapes than the standard code. By constructing low-dimensional visualizations of these landscapes, which each comprise more than 16 million mRNA sequences, we show that such alternative codes radically alter the topological features of the network of high-fitness genotypes. Whereas the genetic codes that optimize evolvability depend to some extent on the detailed relationship between amino acid sequence and protein function, we also uncover general design principles for engineering nonstandard genetic codes for enhanced and diminished evolvability, which may facilitate directed protein evolution experiments and the bio-containment of synthetic organisms, respectively.
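The statement that the genetic code determines which amino acid changes are accessible via single-nucleotide mutations can be illustrated with a minimal sketch. The function name `accessible_amino_acids` and the `table` parameter are illustrative assumptions, not taken from the article; the codon table itself is the standard genetic code, written with DNA bases in TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def accessible_amino_acids(codon, table=CODON_TABLE):
    """Return the amino acids (or '*') reachable from `codon` by one nucleotide
    change, excluding the amino acid the codon already encodes.

    `table` is a hypothetical hook: passing a rewired codon-to-amino-acid
    mapping shows how a nonstandard code changes what is accessible.
    """
    own = table[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                reachable.add(table[mutant])
    reachable.discard(own)
    return reachable
```

For example, `accessible_amino_acids("TGG")` lists what a tryptophan codon can reach in one step under the standard code; supplying a different `table` would show how the same mutations map to different amino acid changes under a rewired code.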


Subjects
Molecular Evolution , Genetic Code , Proteins , Proteins/genetics , Proteins/metabolism , Mutation/genetics , Codon/genetics , Genetic Models , Synthetic Biology/methods , Protein Biosynthesis , Protein Engineering/methods
2.
Proc Natl Acad Sci U S A ; 119(7)2022 02 15.
Article in English | MEDLINE | ID: mdl-35145034

ABSTRACT

Evolutionary adaptation often occurs by the fixation of beneficial mutations. This mode of adaptation can be characterized quantitatively by a spectrum of adaptive substitutions, i.e., a distribution for types of changes fixed in adaptation. Recent work establishes that the changes involved in adaptation reflect common types of mutations, raising the question of how strongly the mutation spectrum shapes the spectrum of adaptive substitutions. We address this question with a codon-based model for the spectrum of adaptive amino acid substitutions, applied to three large datasets covering thousands of amino acid changes identified in natural and experimental adaptation in Saccharomyces cerevisiae, Escherichia coli, and Mycobacterium tuberculosis. Using species-specific mutation spectra based on prior knowledge, we find that the mutation spectrum has a proportional influence on the spectrum of adaptive substitutions in all three species. Indeed, we find that by inferring the mutation rates that best explain the spectrum of adaptive substitutions, we can accurately recover the species-specific mutation spectra. However, we also find that the predictive power of the model differs substantially between the three species. To better understand these differences, we use population simulations to explore the factors that influence how closely the spectrum of adaptive substitutions mirrors the mutation spectrum. The results show that the influence of the mutation spectrum decreases with increasing mutational supply ([Formula: see text]) and that predictive power is strongly affected by the number and diversity of beneficial mutations.


Subjects
Physiological Adaptation , Escherichia coli/genetics , Mycobacterium tuberculosis/genetics , Saccharomyces cerevisiae/genetics , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Escherichia coli/physiology , Fungal Proteins/genetics , Fungal Proteins/metabolism , Bacterial Gene Expression Regulation , Fungal Gene Expression Regulation , Mutation , Mycobacterium tuberculosis/physiology , Saccharomyces cerevisiae/physiology , Species Specificity
3.
Proc Natl Acad Sci U S A ; 119(39): e2204233119, 2022 09 27.
Article in English | MEDLINE | ID: mdl-36129941

ABSTRACT

Contemporary high-throughput mutagenesis experiments are providing an increasingly detailed view of the complex patterns of genetic interaction that occur between multiple mutations within a single protein or regulatory element. By simultaneously measuring the effects of thousands of combinations of mutations, these experiments have revealed that the genotype-phenotype relationship typically reflects not only genetic interactions between pairs of sites but also higher-order interactions among larger numbers of sites. However, modeling and understanding these higher-order interactions remains challenging. Here we present a method for reconstructing sequence-to-function mappings from partially observed data that can accommodate all orders of genetic interaction. The main idea is to make predictions for unobserved genotypes that match the type and extent of epistasis found in the observed data. This information on the type and extent of epistasis can be extracted by considering how phenotypic correlations change as a function of mutational distance, which is equivalent to estimating the fraction of phenotypic variance due to each order of genetic interaction (additive, pairwise, three-way, etc.). Using these estimated variance components, we then define an empirical Bayes prior that in expectation matches the observed pattern of epistasis and reconstruct the genotype-phenotype mapping by conducting Gaussian process regression under this prior. To demonstrate the power of this approach, we present an application to the antibody-binding domain GB1 and also provide a detailed exploration of a dataset consisting of high-throughput measurements for the splicing efficiency of human pre-mRNA [Formula: see text] splice sites, for which we also validate our model predictions via additional low-throughput experiments.


Subjects
Genetic Epistasis , RNA Precursors , Bayes Theorem , Chromosome Mapping , Computational Biology , Genotype , Humans , Genetic Models , Mutation , Phenotype , RNA Splicing
4.
Proc Natl Acad Sci U S A ; 118(40)2021 10 05.
Article in English | MEDLINE | ID: mdl-34599093

ABSTRACT

Density estimation in sequence space is a fundamental problem in machine learning that is also of great importance in computational biology. Due to the discrete nature and large dimensionality of sequence space, how best to estimate such probability distributions from a sample of observed sequences remains unclear. One common strategy for addressing this problem is to estimate the probability distribution using maximum entropy (i.e., calculating point estimates for some set of correlations based on the observed sequences and predicting the probability distribution that is as uniform as possible while still matching these point estimates). Building on recent advances in Bayesian field-theoretic density estimation, we present a generalization of this maximum entropy approach that provides greater expressivity in regions of sequence space where data are plentiful while still maintaining a conservative maximum entropy character in regions of sequence space where data are sparse or absent. In particular, we define a family of priors for probability distributions over sequence space with a single hyperparameter that controls the expected magnitude of higher-order correlations. This family of priors then results in a corresponding one-dimensional family of maximum a posteriori estimates that interpolate smoothly between the maximum entropy estimate and the observed sample frequencies. To demonstrate the power of this method, we use it to explore the high-dimensional geometry of the distribution of 5' splice sites found in the human genome and to understand patterns of chromosomal abnormalities across human cancers.


Subjects
Aneuploidy , Computational Biology/methods , Theoretical Models , Neoplasms/genetics , RNA Splice Sites , Humans , Probability
5.
Am Nat ; 202(4): 534-557, 2023 10.
Article in English | MEDLINE | ID: mdl-37792926

ABSTRACT

The joint distribution of selection coefficients and mutation rates is a key determinant of the genetic architecture of molecular adaptation. Three different distributions are of immediate interest: (1) the "nominal" distribution of possible changes, prior to mutation or selection; (2) the "de novo" distribution of realized mutations; and (3) the "fixed" distribution of selectively established mutations. Here, we formally characterize the relationships between these joint distributions under the strong-selection/weak-mutation (SSWM) regime. The de novo distribution is enriched relative to the nominal distribution for the highest rate mutations, and the fixed distribution is further enriched for the most highly beneficial mutations. Whereas mutation rates and selection coefficients are often assumed to be uncorrelated, we show that even with no correlation in the nominal distribution, the resulting de novo and fixed distributions can have correlations with any combination of signs. Nonetheless, we suggest that natural systems with a finite number of beneficial mutations will frequently have the kind of nominal distribution that induces negative correlations in the fixed distribution. We apply our mathematical framework, along with population simulations, to explore joint distributions of selection coefficients and mutation rates from deep mutational scanning and cancer informatics. Finally, we consider the evolutionary implications of these joint distributions together with two additional joint distributions relevant to parallelism and the rate of adaptation.
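The relationship between the nominal, de novo, and fixed distributions can be sketched numerically. This is a rough illustration under stated assumptions, not the article's framework: each possible beneficial change in the nominal set is weighted equally, de novo weights are taken as proportional to mutation rate, and the fixation probability is approximated by the classic 2s rule for small positive selection coefficients. The function name `sswm_distributions` is hypothetical.

```python
def sswm_distributions(nominal):
    """Given a list of (mutation_rate, selection_coefficient) pairs for
    beneficial changes, return (de_novo, fixed) weights, each summing to 1.

    Assumptions (illustrative only): every nominal change counts once,
    de novo weight is proportional to mutation rate, and the probability
    of fixation is approximated as 2 * s.
    """
    total_rate = sum(mu for mu, _ in nominal)
    de_novo = [mu / total_rate for mu, _ in nominal]

    # Weight of each change among fixations: rate of origin times fixation probability.
    fixation_weights = [mu * 2.0 * s for mu, s in nominal]
    total_weight = sum(fixation_weights)
    fixed = [w / total_weight for w in fixation_weights]
    return de_novo, fixed


# Three hypothetical changes: a baseline, a high-rate one, and a highly beneficial one.
example = [(1e-9, 0.01), (4e-9, 0.01), (1e-9, 0.04)]
de_novo, fixed = sswm_distributions(example)
```

In this toy example the high-rate change dominates the de novo weights, while in the fixed weights the highly beneficial change rises to match it, which mirrors the two enrichment steps described in the abstract.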


Subjects
Mutation Rate , Genetic Selection , Genetic Models , Mutation , Biological Evolution , Molecular Evolution
6.
Annu Rev Genomics Hum Genet ; 20: 99-127, 2019 08 31.
Article in English | MEDLINE | ID: mdl-31091417

ABSTRACT

Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.


Subjects
Genetic Epistasis , Genetic Association Studies , High-Throughput Nucleotide Sequencing/methods , Genetic Models , SELEX Aptamer Technique/methods , DNA/genetics , DNA/metabolism , Genotype , Humans , Mutation , Phenotype , Protein Binding , RNA Splicing , Transcription Factors/genetics , Transcription Factors/metabolism , Genetic Transcription
7.
Proc Natl Acad Sci U S A ; 115(32): E7550-E7558, 2018 08 07.
Article in English | MEDLINE | ID: mdl-30037990

ABSTRACT

Genotype-phenotype relationships are notoriously complicated. Idiosyncratic interactions between specific combinations of mutations occur and are difficult to predict. Yet it is increasingly clear that many interactions can be understood in terms of global epistasis. That is, mutations may act additively on some underlying, unobserved trait, and this trait is then transformed via a nonlinear function to the observed phenotype as a result of subsequent biophysical and cellular processes. Here we infer the shape of such global epistasis in three proteins, based on published high-throughput mutagenesis data. To do so, we develop a maximum-likelihood inference procedure using a flexible family of monotonic nonlinear functions spanned by an I-spline basis. Our analysis uncovers dramatic nonlinearities in all three proteins; in some proteins a model with global epistasis accounts for virtually all of the measured variation, whereas in others we find substantial local epistasis as well. This method allows us to test hypotheses about the form of global epistasis and to distinguish variance components attributable to global epistasis, local epistasis, and measurement error.
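The idea of global epistasis, where mutations act additively on an unobserved trait that is then passed through a nonlinear function, can be sketched as follows. The logistic curve here is only a stand-in assumption for the nonlinearity (the abstract describes a flexible monotonic family spanned by an I-spline basis), and the function names are hypothetical.

```python
import math


def logistic(phi):
    """Stand-in monotonic nonlinearity mapping the latent trait to the observed phenotype."""
    return 1.0 / (1.0 + math.exp(-phi))


def identity(phi):
    """No nonlinearity: the observed phenotype equals the latent additive trait."""
    return phi


def pairwise_epistasis(effect_a, effect_b, nonlinearity, baseline=0.0):
    """Deviation of the double mutant from additivity on the observed scale.

    Both mutations are strictly additive on the latent trait; any nonzero
    result arises purely from the nonlinearity.
    """
    wild_type = nonlinearity(baseline)
    single_a = nonlinearity(baseline + effect_a)
    single_b = nonlinearity(baseline + effect_b)
    double = nonlinearity(baseline + effect_a + effect_b)
    return double - single_a - single_b + wild_type
```

With `identity` the measured epistasis is zero, whereas with `logistic` two beneficial mutations of effect 2.0 show negative (diminishing-returns) epistasis even though they never interact on the latent scale.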


Subjects
Genetic Epistasis , Molecular Evolution , Genetic Fitness , Genetic Models , Genotype , Statistical Models , Mutation , Nonlinear Dynamics , Phenotype
8.
Mol Biol Evol ; 34(9): 2163-2172, 2017 09 01.
Article in English | MEDLINE | ID: mdl-28645195

ABSTRACT

While mutational biases strongly influence neutral molecular evolution, the role of mutational biases in shaping the course of adaptation is less clear. Here we consider the frequency of transitions relative to transversions among adaptive substitutions. Because mutation rates for transitions are higher than those for transversions, if mutational biases influence the dynamics of adaptation, then transitions should be overrepresented among documented adaptive substitutions. To test this hypothesis, we assembled two sets of data on putatively adaptive amino acid replacements that have occurred in parallel during evolution, either in nature or in the laboratory. We find that the frequency of transitions in these data sets is much higher than would be predicted under a null model where mutation has no effect. Our results are qualitatively similar even if we restrict ourselves to changes that have occurred, not merely twice, but three or more times. These results suggest that the course of adaptation is biased by mutation.
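The distinction between transitions and transversions can be made concrete with a short sketch. The helper names are hypothetical, and the uniform baseline shown (all 12 possible nucleotide changes equally likely) is only a simple reference point; it is not necessarily the null model used in the article.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes, False for transversions."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def transition_fraction(changes):
    """Fraction of (ref, alt) nucleotide changes that are transitions."""
    changes = list(changes)
    return sum(is_transition(ref, alt) for ref, alt in changes) / len(changes)


# All 12 possible single-nucleotide changes, as a uniform reference set.
ALL_CHANGES = [(r, a) for r in "ACGT" for a in "ACGT" if r != a]
```

Calling `transition_fraction(ALL_CHANGES)` gives one third, since 4 of the 12 possible changes are transitions; an observed set of substitutions with a much higher fraction would be enriched for transitions relative to this uniform baseline.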


Subjects
Physiological Adaptation/genetics , Bias , Biological Evolution , Molecular Evolution , Genetic Models , Mutation/genetics , Mutation Rate , Phylogeny , Point Mutation/genetics , Amino Acid Sequence Homology
9.
Heredity (Edinb) ; 121(5): 449-465, 2018 11.
Article in English | MEDLINE | ID: mdl-30232363

ABSTRACT

Understanding evolution on complex fitness landscapes is difficult both because of the large dimensionality of sequence space and the stochasticity inherent to population-genetic processes. Here, I present an integrated suite of mathematical tools for understanding evolution on time-invariant fitness landscapes when mutations occur sufficiently rarely that the population is typically monomorphic and evolution can be modeled as a sequence of well-separated fixation events. The basic intuition behind this suite of tools is that surrounding any particular genotype lies a region of the fitness landscape that is easy to evolve to, while other pieces of the fitness landscape are difficult to evolve to (due to distance, being across a fitness valley, etc.). I propose a rigorous definition for this "dynamical neighborhood" of a genotype which captures several aspects of the distribution of waiting times to evolve from one genotype to another. The neighborhood structure of the landscape as a whole can be summarized as a matrix, and I show how this matrix can be used to approximate the expected waiting time for certain evolutionary events to occur and to provide an intuitive interpretation to existing formal results on the index of dispersion of the molecular clock.


Subjects
Physiological Adaptation/genetics , Genetic Fitness , Mutation , Genotype
10.
Proc Natl Acad Sci U S A ; 112(25): E3226-35, 2015 Jun 23.
Article in English | MEDLINE | ID: mdl-26056312

ABSTRACT

The phenotypic effect of an allele at one genetic site may depend on alleles at other sites, a phenomenon known as epistasis. Epistasis can profoundly influence the process of evolution in populations and shape the patterns of protein divergence across species. Whereas epistasis between adaptive substitutions has been studied extensively, relatively little is known about epistasis under purifying selection. Here we use computational models of thermodynamic stability in a ligand-binding protein to explore the structure of epistasis in simulations of protein sequence evolution. Even though the predicted effects on stability of random mutations are almost completely additive, the mutations that fix under purifying selection are enriched for epistasis. In particular, the mutations that fix are contingent on previous substitutions: Although nearly neutral at their time of fixation, these mutations would be deleterious in the absence of preceding substitutions. Conversely, substitutions under purifying selection are subsequently entrenched by epistasis with later substitutions: They become increasingly deleterious to revert over time. Our results imply that, even under purifying selection, protein sequence evolution is often contingent on history and so it cannot be predicted by the phenotypic effects of mutations assayed in the ancestral background.


Subjects
Molecular Evolution , Proteins/genetics , Genetic Epistasis , Theoretical Models , Mutation , Protein Stability , Thermodynamics