Results 1 - 20 of 27
1.
Cell Genom ; 3(7): 100340, 2023 Jul 12.
Article in English | MEDLINE | ID: mdl-37492101

ABSTRACT

Pediatric brain and spinal cancers are collectively the leading disease-related cause of death in children; thus, we urgently need curative therapeutic strategies for these tumors. To accelerate such discoveries, the Children's Brain Tumor Network (CBTN) and Pacific Pediatric Neuro-Oncology Consortium (PNOC) created a systematic process for tumor biobanking, model generation, and sequencing with immediate access to harmonized data. We leverage these data to establish OpenPBTA, an open collaborative project with over 40 scalable analysis modules that genomically characterize 1,074 pediatric brain tumors. Transcriptomic classification reveals universal TP53 dysregulation in mismatch repair-deficient hypermutant high-grade gliomas and TP53 loss as a significant marker for poor overall survival in ependymomas and H3 K28-mutant diffuse midline gliomas. Already being actively applied to other pediatric cancers and PNOC molecular tumor board decision-making, OpenPBTA is an invaluable resource to the pediatric oncology community.

2.
Life (Basel) ; 12(7), 2022 Jun 24.
Article in English | MEDLINE | ID: mdl-35888041

ABSTRACT

The geosphere of primitive Earth was the source of life's essential building blocks, and the geochemical interactions among chemical elements can inform the origins of biological roles of each element. Minerals provide a record of the fundamental properties that each chemical element contributes to crustal composition, evolution, and subsequent biological utilization. In this study, we investigate correlations between the mineral species and bulk crustal composition of each chemical element. There are statistically significant correlations between the number of elements that each element forms minerals with (#-mineral-elements) and the log of the number of mineral species that each element occurs in, and between #-mineral-elements and the log of the number of mineral localities of that element. There is a lesser correlation between the log of the crustal percentage of each element and #-mineral-elements. In the crustal percentage vs. #-mineral-elements plot, positive outliers have either important biological roles (S, Cu) or toxic biological impacts (Pb, As), while negative outliers have no biological importance (Sc, Ga, Br, Yb). In particular, S is an important bridge element between organic (e.g., amino acids) and inorganic (metal cofactors) biological components. While C and N rarely form minerals together, the two elements commonly form minerals with H, which coincides with the role of H as an electron donor/carrier in biological nitrogen and carbon fixation. Both abundant crustal percentage vs. #-mineral-elements insiders (elements that follow the correlation) and less abundant outsiders (positive outliers from the correlation) have important biological functions as essential structural elements and catalytic cofactors.

3.
Sci Rep ; 12(1): 4956, 2022 Mar 23.
Article in English | MEDLINE | ID: mdl-35322071

ABSTRACT

Earth surface redox conditions are intimately linked to the co-evolution of the geosphere and biosphere. Minerals provide a record of Earth's evolving surface and interior chemistry in geologic time due to many different processes (e.g. tectonic, volcanic, sedimentary, oxidative, etc.). Here, we show how the bipartite network of minerals and their shared constituent elements expanded and evolved over geologic time. To further investigate network expansion over time, we derive and apply a novel metric (weighted mineral element electronegativity coefficient of variation; wMEECV) to quantify intra-mineral electronegativity variation with respect to redox. We find that element electronegativity and hard-soft acid-base (HSAB) properties are central factors in mineral redox chemistry under a wide range of conditions. Global shifts in mineral element electronegativity and HSAB associations represented by wMEECV changes at 1.8 and 0.6 billion years ago align with decreased continental elevation followed by the transition from the intermediate ocean and glaciation eras to post-glaciation, increased atmospheric oxygen in the Phanerozoic, and enhanced continental weathering. Consequently, network analysis of mineral element electronegativity and HSAB properties reveals that orogenic activity, evolving redox state of the mantle, planetary oxygenation, and climatic transitions directly impacted the evolving chemical complexity of Earth's crust.

4.
BMC Ecol Evol ; 21(1): 214, 2021 Nov 29.
Article in English | MEDLINE | ID: mdl-34844571

ABSTRACT

BACKGROUND: Multiple sequence alignments (MSAs) represent the fundamental unit of data input to most comparative sequence analyses. In phylogenetic analyses in particular, errors in MSA construction have the potential to induce further errors in downstream analyses such as phylogenetic reconstruction itself, ancestral state reconstruction, and divergence time estimation. In addition to providing phylogenetic methods with an MSA to analyze, researchers must also specify a suitable evolutionary model for the given analysis. Most commonly, researchers apply relative model selection to select a model from a candidate set and then provide both the MSA and the selected model as input to subsequent analyses. While the influence of MSA errors has been explored for most stages of phylogenetics pipelines, the potential effects of MSA uncertainty on the relative model selection procedure itself have not been explored. RESULTS: We assessed the consistency of relative model selection when presented with multiple perturbed versions of a given MSA. We find that while relative model selection is mostly robust to MSA uncertainty, in a substantial proportion of circumstances, relative model selection identifies distinct best-fitting models from different MSAs created from the same set of sequences. We find that this issue is more pervasive for nucleotide data compared to amino-acid data. However, we also find that it is challenging to predict whether relative model selection will be robust or sensitive to uncertainty in a given MSA. CONCLUSIONS: We find that MSA uncertainty can affect virtually all steps of phylogenetic analysis pipelines to a greater extent than has previously been recognized, including relative model selection.
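
One simple way to summarize this kind of consistency check is sketched below. It assumes that model selection has already been run on each perturbed MSA and that only the name of the best-fitting model was kept. The function name, the example model labels, and the agreement measure are illustrative assumptions, not the procedure used in the study.

```python
from collections import Counter


def selection_consistency(selected_models):
    """Return the most frequently selected model and the fraction of
    perturbed alignments that selected it.

    `selected_models` holds one best-fitting model name per perturbed MSA.
    """
    if not selected_models:
        raise ValueError("need at least one selected model")
    counts = Counter(selected_models)
    top_model, top_count = counts.most_common(1)[0]
    return top_model, top_count / len(selected_models)


# Hypothetical outcomes for five perturbed versions of one alignment.
perturbed_selections = ["GTR+G", "GTR+G", "HKY+G", "GTR+G", "GTR+I+G"]
modal_model, agreement = selection_consistency(perturbed_selections)
```

An agreement of 1.0 would mean every perturbed alignment led to the same model, and lower values indicate sensitivity to alignment uncertainty.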


Subject(s)
Sequence Alignment, Phylogeny, Uncertainty
5.
Mol Biol Evol ; 37(7): 2110-2123, 2020 Jul 01.
Article in English | MEDLINE | ID: mdl-32191313

ABSTRACT

It is regarded as best practice in phylogenetic reconstruction to perform relative model selection to determine an appropriate evolutionary model for the data. This procedure ranks a set of candidate models according to their goodness of fit to the data, commonly using an information theoretic criterion. Users then specify the best-ranking model for inference. Although it is often assumed that better-fitting models translate to increased accuracy, recent studies have shown that the specific model employed may not substantially affect inferences. We examine whether there is a systematic relationship between relative model fit and topological inference accuracy in protein phylogenetics, using simulations and real sequences. Simulations employed site-heterogeneous mechanistic codon models that are distinct from protein-level phylogenetic inference models, allowing us to investigate how protein models perform when they are misspecified to the data, as will be the case for any real sequence analysis. We broadly find that phylogenies inferred across models with vastly different fits to the data produce highly consistent topologies. We additionally find that all models infer similar proportions of false-positive splits, raising the possibility that all available models of protein evolution are similarly misspecified. Moreover, we find that the parameter-rich GTR (general time reversible) model, whose amino acid exchangeabilities are free parameters, performs similarly to models with fixed exchangeabilities, although the inference precision associated with GTR models was not examined. We conclude that, although relative model selection may not hinder phylogenetic analysis on protein data, it may not offer specific predictable improvements and is not a reliable proxy for accuracy.
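
The ranking step of relative model selection can be sketched as follows, using the Akaike information criterion (AIC = 2k - 2 ln L) as one example of an information theoretic criterion. The model names, log-likelihoods, and parameter counts are invented for illustration and are not taken from the study, which may have used other criteria.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(fits):
    """Return (name, AIC) pairs sorted from best (lowest AIC) to worst.

    `fits` maps a model name to a (log_likelihood, n_params) tuple.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical fits: all values are made up for illustration only.
example_fits = {
    "ModelA": (-10250.0, 40),
    "ModelB": (-10190.0, 41),
    "ModelC": (-10400.0, 39),
}

ranking = rank_models(example_fits)
best_model = ranking[0][0]  # the model a user would pass on to inference
```

Note that the ranking says only which candidate fits best relative to the others. As the abstract stresses, it does not by itself indicate how accurate the resulting inference will be.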


Subject(s)
Genetic Models, Phylogeny, Computer Simulation
6.
Geobiology ; 18(2): 127-138, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32048807

ABSTRACT

The incorporation of metal cofactors into protein active sites and/or active regions expanded the network of microbial metabolism during the Archean eon. The bioavailability of crucial metal cofactors is largely influenced by earth surface redox state, which impacted the timing of metabolic evolution. Vanadium (V) is a unique element in geo-bio-coevolution due to its complex redox chemistry and specific biological functions. Thus, the extent of microbial V utilization potentially represents an important link between the geo- and biospheres in deep time. In this study, we used geochemical modeling and network analysis to investigate the availability and chemical speciation of V in the environment, and the emergence and changing chemistry of V-containing minerals throughout earth history. The redox state of V shifted from a more reduced V(III) state in Archean aqueous geochemistry and mineralogy to more oxidized V(IV) and V(V) states in the Proterozoic and Phanerozoic. The weathering of vanadium sulfides, vanadium alkali metal minerals, and vanadium alkaline earth metal minerals was a potential source of V to the environment and for microbial utilization. Community detection analysis of the expanding V mineral network indicates tectonic and redox influence on the distribution of V mineral-forming elements. In reducing environments, energetic drivers existed for V to potentially be involved in early nitrogen fixation, while in oxidizing environments vanadate (VO₄³⁻) could have acted as a metabolic electron acceptor and phosphate-mimicking enzyme inhibitor. The coevolving chemical speciation and biological functions of V due to earth's changing surface redox conditions demonstrate the crucial links between the geosphere and biosphere in the evolution of metabolic electron transfer pathways and biogeochemical cycles from the Archean to Phanerozoic.


Subject(s)
Vanadium/chemistry, Biological Availability, Earth (Planet), Oxidation-Reduction, Water
7.
Mol Biol Evol ; 37(1): 295-299, 2020 Jan 01.
Article in English | MEDLINE | ID: mdl-31504749

ABSTRACT

HYpothesis testing using PHYlogenies (HyPhy) is a scriptable, open-source package for fitting a broad range of evolutionary models to multiple sequence alignments, and for conducting subsequent parameter estimation and hypothesis testing, primarily in the maximum likelihood statistical framework. It has become a popular choice for characterizing various aspects of the evolutionary process: natural selection, evolutionary rates, recombination, and coevolution. The 2.5 release (available from www.hyphy.org) includes a completely re-engineered computational core and analysis library that introduces new classes of evolutionary models and statistical tests, delivers substantial performance and stability enhancements, improves usability, streamlines end-to-end analysis workflows, makes it easier to develop custom analyses, and is mostly backward compatible with previous HyPhy releases.


Subject(s)
Genetic Techniques, Phylogeny, Software
8.
Methods Mol Biol ; 1910: 427-468, 2019.
Article in English | MEDLINE | ID: mdl-31278673

ABSTRACT

Natural selection is a fundamental force shaping organismal evolution, as it both maintains function and enables adaptation and innovation. Viruses, with their typically short and largely coding genomes, experience strong and diverse selective forces, sometimes acting on timescales that can be directly measured. These selection pressures emerge from an antagonistic interplay between rapidly changing fitness requirements (immune and antiviral responses from hosts, transmission between hosts, or colonization of new host species) and functional imperatives (the ability to infect hosts or host cells and replicate within hosts). Indeed, computational methods to quantify these evolutionary forces using molecular sequence data were initially, dating back to the 1980s, applied to the study of viral pathogens. This preference largely emerged because the strong selective forces are easiest to detect in viruses, and, of course, viruses have clear biomedical relevance. Recent commoditization of affordable high-throughput sequencing has made it possible to generate truly massive genomic data sets, on which powerful and accurate methods can yield a very detailed depiction of when, where, and (sometimes) how viral pathogens respond to various selective forces. Here, we present recent statistical developments and state-of-the-art methods to identify and characterize these selection pressures from protein-coding sequence alignments and phylogenies. Methods described here can reveal critical information about various evolutionary regimes, including whole-gene selection, lineage-specific selection, and site-specific selection acting upon viral genomes, while accounting for confounding biological processes, such as recombination and variation in mutation rates.


Subject(s)
Molecular Evolution, Viral Genome, Genomics, Viruses/genetics, Codon, Computational Biology/methods, Genetic Variation, Genomics/methods, Phylogeny, Genetic Recombination, Genetic Selection, Software, Viruses/classification
9.
Evolution ; 72(10): 2234-2243, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30152871

ABSTRACT

Viral gain-of-function mutations frequently evolve during laboratory experiments. Whether the specific mutations that evolve in the lab also evolve in nature and whether they have the same impact on evolution in the real world is unknown. We studied a model virus, bacteriophage λ, that repeatedly evolves to exploit a new host receptor under typical laboratory conditions. Here, we demonstrate that two residues of λ's J protein are required for the new function. In natural λ variants, these amino acid sites are highly diverse and evolve at high rates. Insertions and deletions at these locations are associated with phylogenetic patterns indicative of ecological diversification. Our results show that viral evolution in the laboratory mirrors that in nature and that laboratory experiments can be coupled with protein sequence analyses to identify the causes of viral evolution in the real world. Furthermore, our results provide evidence for widespread host-shift evolution in lambdoid viruses.


Subject(s)
Bacteriophage lambda/genetics, Molecular Evolution, Gain of Function Mutation/genetics, Genetic Selection, Phylogeny
10.
Mol Biol Evol ; 35(9): 2307-2317, 2018 Sep 01.
Article in English | MEDLINE | ID: mdl-29924340

ABSTRACT

The relative evolutionary rates at individual sites in proteins are informative measures of conservation or adaptation. Often used as evolutionarily aware conservation scores, relative rates reveal key functional or strongly selected residues. Estimating rates in a phylogenetic context requires specifying a protein substitution model, which is typically a phenomenological model trained on a large empirical data set. A strong emphasis has traditionally been placed on selecting the "best-fit" model, with the implicit understanding that suboptimal or otherwise ill-fitting models might bias inferences. However, the pervasiveness and degree of such bias have not been systematically examined. We investigated how model choice impacts site-wise relative rates in a large set of empirical protein alignments. We compared models designed for use on any general protein, models designed for specific domains of life, and the simple equal-rates Jukes-Cantor-style model (JC). As expected, information theoretic measures showed overwhelming evidence that some models fit the data decidedly better than others. By contrast, estimates of site-specific evolutionary rates were impressively insensitive to the substitution model used, revealing an unexpected degree of robustness to potential model misspecification. A deeper examination of the fewer than 5% of sites for which model inferences differed in a meaningful way showed that the JC model could uniquely identify rapidly evolving sites that models with empirically derived exchangeabilities failed to detect. We conclude that relative protein rates appear robust to the applied substitution model, and any sensible model of protein evolution, regardless of its fit to the data, should produce broadly consistent evolutionary rates.


Subject(s)
Molecular Evolution, Genetic Techniques, Genetic Models, Proteins/genetics
11.
PeerJ ; 6: e4339, 2018.
Article in English | MEDLINE | ID: mdl-29423346

ABSTRACT

We introduce LEISR (Likelihood Estimation of Individual Site Rates, pronounced "laser"), a tool to infer relative evolutionary rates from protein and nucleotide data, implemented in HyPhy. LEISR is based on the popular Rate4Site (Pupko et al., 2002) approach for inferring relative site-wise evolutionary rates, primarily from protein data. We extend the original method for more general use in several key ways: (i) we increase the support for nucleotide data with additional models, (ii) we allow for datasets of arbitrary size, (iii) we support analysis of site-partitioned datasets to correct for the presence of recombination breakpoints, (iv) we produce rate estimates at all sites rather than at just a subset of sites, and (v) we implemented LEISR as MPI-enabled to support rapid, high-throughput analysis. LEISR is available in HyPhy starting with version 2.3.8, and it is accessible as an option in the HyPhy analysis menu ("Relative evolutionary rate inference"), which calls the HyPhy batchfile LEISR.bf.

12.
Mol Biol Evol ; 35(3): 773-777, 2018 Mar 01.
Article in English | MEDLINE | ID: mdl-29301006

ABSTRACT

Inference of how evolutionary forces have shaped extant genetic diversity is a cornerstone of modern comparative sequence analysis. Advances in sequence generation and increased statistical sophistication of relevant methods now allow researchers to extract ever more evolutionary signal from the data, albeit at an increased computational cost. Here, we announce the release of Datamonkey 2.0, a completely re-engineered version of the Datamonkey web-server for analyzing evolutionary signatures in sequence data. For this endeavor, we leveraged recent developments in open-source libraries that facilitate interactive, robust, and scalable web application development. Datamonkey 2.0 provides a carefully curated collection of methods for interrogating coding-sequence alignments for imprints of natural selection, packaged as a responsive (i.e. can be viewed on tablet and mobile devices), fully interactive, and API-enabled web application. To complement Datamonkey 2.0, we additionally release HyPhy Vision, an accompanying JavaScript application for visualizing analysis results. HyPhy Vision can also be used separately from Datamonkey 2.0 to visualize locally executed HyPhy analyses. Together, Datamonkey 2.0 and HyPhy Vision showcase how scientific software development can benefit from general-purpose open-source frameworks. Datamonkey 2.0 is freely and publicly available at http://www.datamonkey.org, and the underlying codebase is available from https://github.com/veg/datamonkey-js.

13.
PLoS One ; 12(4): e0164905, 2017.
Article in English | MEDLINE | ID: mdl-28369116

ABSTRACT

Proteins evolve through two primary mechanisms: substitution, where mutations alter a protein's amino-acid sequence, and insertions and deletions (indels), where amino acids are either added to or removed from the sequence. Protein structure has been shown to influence the rate at which substitutions accumulate across sites in proteins, but whether structure similarly constrains the occurrence of indels has not been rigorously studied. Here, we investigate the extent to which structural properties known to covary with protein evolutionary rates might also predict protein tolerance to indels. Specifically, we analyze a publicly available dataset of single-amino-acid deletion mutations in enhanced green fluorescent protein (eGFP) to assess how well the functional effect of deletions can be predicted from protein structure. We find that weighted contact number (WCN), which measures how densely packed a residue is within the protein's three-dimensional structure, provides the best single predictor for whether eGFP will tolerate a given deletion. We additionally find that using protein design to explicitly model deletions results in improved predictions of functional status when combined with other structural predictors. Our work suggests that structure plays a fundamental role in constraining deletions at sites in proteins, and further that similar biophysical constraints influence both substitutions and deletions. This study therefore provides a solid foundation for future work to examine how protein structure influences tolerance of more complex indel events, such as insertions or large deletions.
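
For readers unfamiliar with WCN, a minimal sketch of the commonly used definition follows: for residue i, WCN_i is the sum over all other residues j of 1 / r_ij², where r_ij is the distance between the two residues. The abstract does not state which atom represents each residue, so the single-point-per-residue input and the toy coordinates below are assumptions for illustration only.

```python
import math


def weighted_contact_numbers(coords):
    """Compute WCN_i = sum over j != i of 1 / r_ij**2 for each residue.

    `coords` is a list of (x, y, z) tuples, one representative point per
    residue (for example a C-alpha atom); the choice of atom is an assumption.
    """
    wcn = []
    for i, a in enumerate(coords):
        total = 0.0
        for j, b in enumerate(coords):
            if i == j:
                continue  # a residue does not count as its own contact
            total += 1.0 / math.dist(a, b) ** 2
        wcn.append(total)
    return wcn


# Toy coordinates (not real protein data): three points on a line.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_wcn = weighted_contact_numbers(toy_coords)
```

In the toy example the middle point has the largest value because it is closest to both neighbors, which matches the intuition that densely packed residues have high WCN.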


Subject(s)
Green Fluorescent Proteins/chemistry, Green Fluorescent Proteins/genetics, Amino Acid Sequence, Directed Molecular Evolution, Fluorescence, INDEL Mutation, Logistic Models, Molecular Models, Protein Secondary Structure, Sequence Deletion, Support Vector Machine
14.
F1000Res ; 6: 1845, 2017.
Article in English | MEDLINE | ID: mdl-29167739

ABSTRACT

We describe how to measure site-specific rates of evolution in protein-coding genes and how to correlate these rates with structural features of the expressed protein, such as relative solvent accessibility, secondary structure, or weighted contact number. We present two alternative approaches to rate calculations: one based on relative amino-acid rates, and the other based on site-specific codon rates measured as dN/dS. We additionally provide a code repository containing scripts to facilitate the specific analysis protocols we recommend.
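
The correlation step can be illustrated with a short sketch. It computes a Pearson correlation between per-site rates and one structural feature; the choice of Pearson rather than a rank-based coefficient, and the example values, are assumptions made here and are not taken from the article or its repository.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-site values: evolutionary rate and relative solvent
# accessibility for four sites (made-up numbers for illustration).
site_rates = [0.2, 0.5, 1.0, 1.8]
site_rsa = [0.05, 0.20, 0.45, 0.80]
r = pearson_r(site_rates, site_rsa)
```

A positive coefficient here would correspond to the typical pattern in which more exposed sites evolve faster.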

15.
J Cell Biol ; 216(1): 167-179, 2017 Jan 02.
Article in English | MEDLINE | ID: mdl-28003333

ABSTRACT

The critical initiation phase of clathrin-mediated endocytosis (CME) determines where and when endocytosis occurs. Heterotetrameric adaptor protein 2 (AP2) complexes, which initiate clathrin-coated pit (CCP) assembly, are activated by conformational changes in response to phosphatidylinositol-4,5-bisphosphate (PIP2) and cargo binding at multiple sites. However, the functional hierarchy of interactions and how these conformational changes relate to distinct steps in CCP formation in living cells remain unknown. We used quantitative live-cell analyses to measure discrete early stages of CME and show how sequential, allosterically regulated conformational changes activate AP2 to drive both nucleation and subsequent stabilization of nascent CCPs. Our data establish that cargoes containing the Yxxφ motif, but not the dileucine motif, play a critical role in the earliest stages of AP2 activation and CCP nucleation. Interestingly, these cargo and PIP2 interactions are not conserved in yeast. Thus, we speculate that AP2 has evolved as a key regulatory node to coordinate CCP formation and cargo sorting and ensure high spatial and temporal regulation of CME.


Subject(s)
Adaptor Protein Complex 2/metabolism, Clathrin-Coated Vesicles/metabolism, Clathrin/metabolism, Cell-Membrane Coated Pits/metabolism, Endocytosis, Retinal Pigment Epithelium/metabolism, Adaptor Protein Complex 2/chemistry, Adaptor Protein Complex 2/genetics, Adaptor Protein Complex alpha Subunits/genetics, Adaptor Protein Complex alpha Subunits/metabolism, Adaptor Protein Complex mu Subunits/genetics, Adaptor Protein Complex mu Subunits/metabolism, Amino Acid Motifs, Cell Line, Humans, Phosphatidylinositol 4,5-Diphosphate/metabolism, Protein Binding, Protein Conformation, Protein Serine-Threonine Kinases/metabolism, Protein Stability, Protein Transport, RNA Interference, Signal Transduction, Structure-Activity Relationship, Time Factors, Transfection
16.
Genetics ; 204(2): 499-511, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27535929

ABSTRACT

Two broad paradigms exist for inferring dN/dS, the ratio of nonsynonymous to synonymous substitution rates, from coding sequences: (i) a one-rate approach, where dN/dS is represented with a single parameter, or (ii) a two-rate approach, where dN and dS are estimated separately. The performances of these two approaches have been well studied in the specific context of proper model specification, i.e., when the inference model matches the simulation model. By contrast, the relative performances of one-rate vs. two-rate parameterizations when applied to data generated according to a different mechanism remain unclear. Here, we compare the relative merits of one-rate and two-rate approaches in the specific context of model misspecification by simulating alignments with mutation-selection models rather than with dN/dS-based models. We find that one-rate frameworks generally infer more accurate dN/dS point estimates, even when dS varies among sites. In other words, modeling dS variation may substantially reduce accuracy of dN/dS point estimates. These results appear to depend on the selective constraint operating at a given site. For sites under strong purifying selection ([Formula: see text]), one-rate and two-rate models show comparable performances. However, one-rate models significantly outperform two-rate models for sites under moderate-to-weak purifying selection. We attribute this distinction to the fact that, for these more quickly evolving sites, a given substitution is more likely to be nonsynonymous than synonymous. The data will therefore be relatively enriched for nonsynonymous changes, and modeling dS contributes excessive noise to dN/dS estimates. We additionally find that high levels of divergence among sequences, rather than the number of sequences in the alignment, are more critical for obtaining precise point estimates.


Subject(s)
Amino Acid Substitution/genetics, Molecular Evolution, Phylogeny, Genetic Selection, Computer Simulation/statistics & numerical data, Genetic Variation, Mutation/genetics, Sequence Alignment/statistics & numerical data
17.
Mol Biol Evol ; 33(11): 2990-3002, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27512115

ABSTRACT

The mutation-selection model of coding sequence evolution has received renewed attention for its use in estimating site-specific amino acid propensities and selection coefficient distributions. Two computationally tractable mutation-selection inference frameworks have been introduced: One framework employs a fixed-effects, highly parameterized maximum likelihood approach, whereas the other employs a random-effects Bayesian Dirichlet Process approach. While both implementations follow the same model, they appear to make distinct predictions about the distribution of selection coefficients. The fixed-effects framework estimates a large proportion of highly deleterious substitutions, whereas the random-effects framework estimates that all substitutions are either nearly neutral or weakly deleterious. It remains unknown, however, how accurately each method infers evolutionary constraints at individual sites. Indeed, selection coefficient distributions pool all site-specific inferences, thereby obscuring a precise assessment of site-specific estimates. Therefore, in this study, we use a simulation-based strategy to determine how accurately each approach recapitulates the selective constraint at individual sites. We find that the fixed-effects approach, despite its extensive parameterization, consistently and accurately estimates site-specific evolutionary constraint. By contrast, the random-effects Bayesian approach systematically underestimates the strength of natural selection, particularly for slowly evolving sites. We also find that, despite the strong differences between their inferred selection coefficient distributions, the fixed- and random-effects approaches yield surprisingly similar inferences of site-specific selective constraint. We conclude that the fixed-effects mutation-selection framework provides the more reliable software platform for model application and future development.


Subject(s)
Genetic Models, Mutation, DNA Sequence Analysis/methods, Amino Acid Substitution, Amino Acids/genetics, Bayes Theorem, Molecular Evolution, Genetic Variation, Likelihood Functions, Open Reading Frames, Phylogeny, Genetic Selection
18.
Protein Sci ; 25(7): 1341-53, 2016 Jul.
Article in English | MEDLINE | ID: mdl-26971720

ABSTRACT

Structural properties such as solvent accessibility and contact number predict site-specific sequence variability in many proteins. However, the strength and significance of these structure-sequence relationships vary widely among different proteins, with absolute correlation strengths ranging from 0 to 0.8. In particular, two recent works have made contradictory observations. Yeh et al. (Mol. Biol. Evol. 31:135-139, 2014) found that both relative solvent accessibility (RSA) and weighted contact number (WCN) are good predictors of sitewise evolutionary rate in enzymes, with WCN clearly out-performing RSA. Shahmoradi et al. (J. Mol. Evol. 79:130-142, 2014) considered these same predictors (as well as others) in viral proteins and found much weaker correlations and no clear advantage of WCN over RSA. Because these two studies had substantial methodological differences, however, a direct comparison of their results is not possible. Here, we reanalyze the datasets of the two studies with one uniform analysis pipeline, and we find that many apparent discrepancies between the two analyses can be attributed to the extent of sequence divergence in individual alignments. Specifically, the alignments of the enzyme dataset are much more diverged than those of the virus dataset, and proteins with higher divergence exhibit, on average, stronger structure-sequence correlations. However, the highest structure-sequence correlations are observed at intermediate divergence levels, where both highly conserved and highly variable sites are present in the same alignment.


Subject(s)
Amino Acid Sequence, Enzymes/chemistry, Viral Proteins/chemistry, Computational Biology/methods, Enzymes/genetics, Molecular Evolution, Molecular Models, Protein Conformation, Sequence Alignment, Solvents/chemistry, Viral Proteins/genetics
19.
Nat Rev Genet ; 17(2): 109-21, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26781812

ABSTRACT

It has long been recognized that certain sites within a protein, such as sites in the protein core or catalytic residues in enzymes, are evolutionarily more conserved than other sites. However, our understanding of rate variation among sites remains surprisingly limited. Recent progress to address this includes the development of a wide array of reliable methods to estimate site-specific substitution rates from sequence alignments. In addition, several molecular traits have been identified that correlate with site-specific mutation rates, and novel mechanistic biophysical models have been proposed to explain the observed correlations. Nonetheless, current models explain, at best, approximately 60% of the observed variance, highlighting the limitations of current methods and models and the need for new research directions.


Subject(s)
Molecular Evolution, Proteins/genetics, Computational Biology/methods, Proteins/chemistry, Proteins/metabolism
20.
PLoS One ; 10(9): e0139047, 2015.
Article in English | MEDLINE | ID: mdl-26397960

ABSTRACT

We introduce Pyvolve, a flexible Python module for simulating genetic data along a phylogeny using continuous-time Markov models of sequence evolution. Easily incorporated into Python bioinformatics pipelines, Pyvolve can simulate sequences according to most standard models of nucleotide, amino-acid, and codon sequence evolution. All model parameters are fully customizable. Users can additionally specify custom evolutionary models, with custom rate matrices and/or states to evolve. This flexibility makes Pyvolve a convenient framework not only for simulating sequences under a wide variety of conditions, but also for developing and testing new evolutionary models. Pyvolve is an open-source project under a FreeBSD license, and it is available for download, along with a detailed user-manual and example scripts, from http://github.com/sjspielman/pyvolve.
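
To make the underlying idea concrete, the following standard-library sketch evolves a nucleotide sequence along a single branch under the Jukes-Cantor (JC69) continuous-time Markov model. It does not use or reproduce Pyvolve's API; the function names, the seed, and the example sequence are illustrative assumptions, and a real simulation would traverse a whole phylogeny and support richer models.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def jc69_same_prob(branch_length):
    """Probability that a site shows the same nucleotide after a branch of
    `branch_length` expected substitutions per site under JC69."""
    return 0.25 + 0.75 * math.exp(-4.0 * branch_length / 3.0)


def evolve_sequence(seq, branch_length, rng):
    """Evolve `seq` along one branch under JC69, treating sites independently."""
    p_same = jc69_same_prob(branch_length)
    out = []
    for base in seq:
        if rng.random() < p_same:
            out.append(base)
        else:
            # Under JC69 the three other nucleotides are equally likely.
            out.append(rng.choice([n for n in NUCLEOTIDES if n != base]))
    return "".join(out)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
ancestor = "ACGTACGTAC"
descendant = evolve_sequence(ancestor, 0.5, rng)
```

With a branch length of zero the sequence is returned unchanged, and for very long branches each site approaches the uniform stationary distribution over the four nucleotides.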


Subject(s)
Base Sequence/genetics, Computer Simulation, Genetic Models, Phylogeny, Amino Acid Sequence/genetics, Animals, Genomics/methods, Humans, Markov Chains