1.
Syst Biol ; 60(2): 161-74, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21233085

ABSTRACT

Most phylogenetic models of protein evolution assume that sites are independent and identically distributed. Interactions between sites are ignored, and the likelihood can be conveniently calculated as the product of the individual site likelihoods. The calculation considers all possible transition paths (also called substitution histories or mappings) that are consistent with the observed states at the terminals, and the probability density of any particular reconstruction depends on the substitution model. The likelihood is the integral of the probability density of each substitution history taken over all possible histories that are consistent with the observed data. We investigated the extent to which transition paths that are incompatible with a protein's three-dimensional structure contribute to the likelihood. Several empirical amino acid models were tested for sequence pairs of different degrees of divergence. When simulating substitutional histories starting from a real sequence, the structural integrity of the simulated sequences quickly disintegrated. This result indicates that simple models are clearly unable to capture the constraints on sequence evolution. However, when we sampled transition paths between real sequences from the posterior probability distribution according to these same models, we found that the sampled histories were largely consistent with the tertiary structure. This suggests that simple empirical substitution models may be adequate for interpolating changes between observed sequences during phylogenetic inference despite the fact that the models cannot predict the effects of structural constraints from first principles. This study is significant because it provides a quantitative assessment of the biological realism of substitution models from the perspective of protein structure, and it provides insight on the prospects for improving models of protein sequence evolution.
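
The product-of-site-likelihoods calculation described above can be sketched for a single pair of aligned sequences. This is a minimal illustration, assuming an equal-rates (Poisson) amino acid model as a stand-in for the empirical models the study tested; the function names are hypothetical. The transition probability over a branch of length t already sums over every substitution history compatible with the two end states, which is why no explicit enumeration of paths appears.

```python
import math

N_STATES = 20  # size of the amino acid alphabet


def poisson_transition_prob(same: bool, t: float) -> float:
    """P(end state | start state, t) under an equal-rates amino acid model.

    t is the branch length in expected substitutions per site.
    This simple model is an assumption made for illustration only.
    """
    decay = math.exp(-N_STATES / (N_STATES - 1) * t)
    if same:
        return 1.0 / N_STATES + (N_STATES - 1) / N_STATES * decay
    return 1.0 / N_STATES - decay / N_STATES


def pairwise_log_likelihood(seq_a: str, seq_b: str, t: float) -> float:
    """Log-likelihood of two aligned sequences separated by branch length t.

    Sites are treated as independent and identically distributed, so the
    likelihood is a product over sites (a sum in log space).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    log_l = 0.0
    for a, b in zip(seq_a, seq_b):
        # Stationary frequency of the start state times the transition probability.
        site_l = (1.0 / N_STATES) * poisson_transition_prob(a == b, t)
        log_l += math.log(site_l)
    return log_l
```

Because sites are independent under this assumption, the log-likelihood of an alignment equals the sum of the log-likelihoods of its columns, whatever order they are taken in.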


Subjects
Molecular Evolution, Proteins/chemistry, Proteins/genetics, Animals, Humans, Likelihood Functions, Phylogeny, Probability
2.
PLoS Biol ; 6(8): e206, 2008 Aug 26.
Article in English | MEDLINE | ID: mdl-18752347

ABSTRACT

Inosine monophosphate dehydrogenase (IMPDH) catalyzes an essential step in the biosynthesis of guanine nucleotides. This reaction involves two different chemical transformations, an NAD-linked redox reaction and a hydrolase reaction, that utilize mutually exclusive protein conformations with distinct catalytic residues. How did Nature construct such a complicated catalyst? Here we employ a "Wang-Landau" metadynamics algorithm in hybrid quantum mechanical/molecular mechanical (QM/MM) simulations to investigate the mechanism of the hydrolase reaction. These simulations show that the lowest energy pathway utilizes Arg418 as the base that activates water, in remarkable agreement with previous experiments. Surprisingly, the simulations also reveal a second pathway for water activation involving a proton relay from Thr321 to Glu431. The energy barrier for the Thr321 pathway is similar to the barrier observed experimentally when Arg418 is removed by mutation. The Thr321 pathway dominates at low pH when Arg418 is protonated, which predicts that the substitution of Glu431 with Gln will shift the pH-rate profile to the right. This prediction is confirmed in subsequent experiments. Phylogenetic analysis suggests that the Thr321 pathway was present in the ancestral enzyme, but was lost when the eukaryotic lineage diverged. We propose that the primordial IMPDH utilized the Thr321 pathway exclusively, and that this mechanism became obsolete when the more sophisticated catalytic machinery of the Arg418 pathway was installed. Thus, our simulations provide an unanticipated window into the evolution of a complex enzyme.


Subjects
Amino Acids/metabolism, IMP Dehydrogenase/chemistry, Biological Models, Water/metabolism, Amino Acid Substitution, Catalysis, Computer Simulation, Hydrolases/metabolism, IMP Dehydrogenase/metabolism, Phylogeny, Quantum Theory, Thermodynamics
3.
Evol Bioinform Online ; 11: 85-96, 2015.
Article in English | MEDLINE | ID: mdl-25987828

ABSTRACT

Models of protein evolution tend to ignore functional constraints, although structural constraints are sometimes incorporated. Here we propose a probabilistic framework for codon substitution that evaluates the joint effects of relative solvent accessibility (RSA), a structural constraint, and gene expression, a functional constraint. First, we explore the relationship between RSA and codon usage at the genomic scale as well as at the individual gene scale. Motivated by these results, we construct our framework by determining how probable an amino acid is, given RSA and gene expression, and then evaluating the relative probability of observing a codon compared to other synonymous codons. We come to the biologically plausible conclusion that both RSA and gene expression are related to amino acid frequencies, but, among synonymous codons, the relative probability of a particular codon is more closely related to gene expression than to RSA. To illustrate the potential applications of our framework, we propose a new codon substitution model. Using this model, we obtain estimates of 2Ns, twice the product of the effective population size N and the relative fitness difference s of an allele. For a training data set consisting of human proteins with known structures and expression data, 2Ns is estimated separately for synonymous and nonsynonymous substitutions in each protein. We then contrast the patterns of synonymous and nonsynonymous 2Ns estimates across proteins while also taking gene expression levels of the proteins into account. We conclude that our 2Ns estimates are too concentrated around 0, and we discuss potential explanations for this lack of variability.
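
One way to read the two-step construction described above is as a factorization of the codon probability. The notation is introduced here for illustration and is not taken from the paper: c is a codon, a(c) is the amino acid it encodes, and E is the expression level of the gene. The second factor is normalized over the synonymous codons of a(c).

```latex
\Pr(c \mid \mathrm{RSA}, E)
  \;=\; \Pr\bigl(a(c) \mid \mathrm{RSA}, E\bigr)\,
        \Pr\bigl(c \mid a(c), \mathrm{RSA}, E\bigr),
\qquad
\sum_{c' \,:\, a(c') = a(c)} \Pr\bigl(c' \mid a(c), \mathrm{RSA}, E\bigr) = 1 .
```

Under this reading, the abstract's conclusion is that the first factor depends on both RSA and E, while the second depends mainly on E.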

4.
Protein Sci ; 21(6): 769-85, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22528593

ABSTRACT

The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundation for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy, and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion, domain rearrangements, and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics, as a departure from static considerations of protein structure, are also important. Protein expression level is known to be a major determinant of evolutionary rate, and several considerations, including selection at the mRNA level and the role of interaction specificity, are discussed. Lastly, the relationship between modeling and the high-throughput experimental data it requires, as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry, is presented, with the aim of ultimately generating better models for biological inference and prediction.


Subjects
Molecular Evolution, Proteins/chemistry, Proteins/genetics, Amino Acid Sequence, Animals, Humans, Molecular Models, Molecular Sequence Data, Protein Conformation, Protein Folding, Messenger RNA/genetics, Sequence Alignment
5.
Philos Trans R Soc Lond B Biol Sci ; 363(1512): 3941-53, 2008 Dec 27.
Article in English | MEDLINE | ID: mdl-18852098

ABSTRACT

Models of amino acid substitution present challenges beyond those often faced with the analysis of DNA sequences. The alignments of amino acid sequences are often small, whereas the number of parameters to be estimated is potentially large when compared with the number of free parameters for nucleotide substitution models. Most approaches to the analysis of amino acid alignments have focused on the use of fixed amino acid models in which all of the potentially free parameters are fixed to values estimated from a large number of sequences. Often, these fixed amino acid models are specific to a gene or taxonomic group (e.g. the Mtmam model, which has parameters that are specific to mammalian mitochondrial gene sequences). Although the fixed amino acid models succeed in reducing the number of free parameters to be estimated (indeed, they reduce the number of free parameters from approximately 200 to 0), it is possible that none of the currently available fixed amino acid models is appropriate for a specific alignment. Here, we present four approaches to the analysis of amino acid sequences. First, we explore the use of a general time reversible model of amino acid substitution using a Dirichlet prior probability distribution on the 190 exchangeability parameters. Second, we then explore the behaviour of prior probability distributions that are 'centred' on the rates specified by the fixed amino acid model. Third, we consider a mixture of fixed amino acid models. Finally, we consider constraints on the exchangeability parameters as partitions, similar to how nucleotide substitution models are specified, and place a Dirichlet process prior model on all the possible partitioning schemes.
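
The figure of 190 exchangeability parameters follows from counting unordered pairs of the 20 amino acids, and a draw from a Dirichlet prior over them can be sketched with the standard library alone. This is a minimal illustration, not the authors' implementation; the function names and the flat concentration values are assumptions.

```python
import random


def n_exchangeabilities(n_states: int) -> int:
    """Number of unordered state pairs in a time-reversible model."""
    return n_states * (n_states - 1) // 2


def sample_dirichlet(alpha, rng: random.Random):
    """Draw one point from Dirichlet(alpha) by normalizing gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


# Hypothetical usage: a flat prior (all concentrations equal to 1) over the
# 190 amino acid exchangeabilities, which are constrained to sum to 1.
rng = random.Random(1)
k = n_exchangeabilities(20)
rates = sample_dirichlet([1.0] * k, rng)
```

With the sum-to-one constraint, 189 exchangeabilities are free; adding 19 free stationary frequencies gives 208, in line with the "approximately 200" quoted above. A prior 'centred' on a fixed model could plausibly be expressed by making the concentration values proportional to that model's rates, although the abstract does not say how the authors did this.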


Subjects
Algorithms, Amino Acid Substitution/genetics, Molecular Evolution, Genetic Models, Sequence Alignment/methods, Bayes Theorem, Computer Simulation
6.
Syst Biol ; 57(1): 86-103, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18278678

ABSTRACT

The main limiting factor in Bayesian MCMC analysis of phylogeny is typically the efficiency with which topology proposals sample tree space. Here we evaluate the performance of seven different proposal mechanisms, including most of those used in current Bayesian phylogenetics software. We sampled 12 empirical nucleotide data sets, ranging in size from 27 to 71 taxa and from 378 to 2,520 sites, under difficult conditions: short runs, no Metropolis-coupling, and an oversimplified substitution model producing difficult tree spaces (Jukes-Cantor with equal site rates). Convergence was assessed by comparison to reference samples obtained from multiple Metropolis-coupled runs. We find that proposals producing topology changes as a side effect of branch length changes (LOCAL and Continuous Change) consistently perform worse than those involving stochastic branch rearrangements (nearest neighbor interchange, subtree pruning and regrafting, tree bisection and reconnection, or subtree swapping). Among the latter, moves that use an extension mechanism to mix local with more distant rearrangements show better overall performance than those involving only local or only random rearrangements. Moves with only local rearrangements tend to mix well but have long burn-in periods, whereas moves with random rearrangements often show the reverse pattern. Combinations of moves tend to perform better than single moves. The time to convergence can be shortened considerably by starting with a good tree, but this comes at the cost of compromising convergence diagnostics based on overdispersed starting points. Our results have important implications for developers of Bayesian MCMC implementations and for the large group of users of Bayesian phylogenetics software.
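
Of the stochastic rearrangements named above, nearest neighbor interchange (NNI) is the simplest to write down. The sketch below shows the two alternative arrangements around one internal edge, together with a generic Metropolis-Hastings acceptance step. It is an illustration under stated assumptions, not code from any of the software packages evaluated: the four subtrees are passed in directly rather than located in a full tree structure, and all names are hypothetical.

```python
import math
import random


def nni_neighbors(a, b, c, d):
    """Two NNI alternatives for an internal edge currently joining (a, b) | (c, d).

    a, b, c and d are the four subtrees attached to the edge; each may be a
    leaf label or a nested tuple.
    """
    return [((a, c), (b, d)), ((a, d), (b, c))]


def propose_nni(a, b, c, d, rng: random.Random):
    """Pick one of the two NNI alternatives uniformly at random."""
    return rng.choice(nni_neighbors(a, b, c, d))


def mh_accept(log_post_current: float, log_post_proposed: float,
              rng: random.Random, log_hastings: float = 0.0) -> bool:
    """Metropolis-Hastings acceptance test in log space.

    log_hastings defaults to 0, which assumes a symmetric proposal.
    """
    log_ratio = log_post_proposed - log_post_current + log_hastings
    if log_ratio >= 0.0:
        return True
    return rng.random() < math.exp(log_ratio)
```

A proposal that improves the posterior is always accepted, while a worse one is accepted with probability equal to the posterior ratio. That is why moves restricted to such local changes can take many steps to leave a poor starting region.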


Subjects
Genetic Models, Phylogeny, Bayes Theorem, Markov Chains, Monte Carlo Method