1.
Theor Popul Biol ; 158: 139-149, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38871089

ABSTRACT

The introduction of the spatial Lambda-Fleming-Viot model (ΛV) in population genetics was mainly driven by the pioneering work of Alison Etheridge, in collaboration with Nick Barton and Amandine Véber, about ten years ago (Barton et al., 2010; Barton et al., 2013). The ΛV model provides a sound mathematical framework for describing the evolution of a population of related individuals along a spatial continuum. It alleviates the "pain in the torus" issue with Wright and Malécot's isolation by distance model and is sampling consistent, making it a tool of choice for statistical inference. Yet, little is known about the potential connections between the ΛV and other stochastic processes generating trees and the spatial coordinates along the corresponding lineages. This work focuses on a version of the ΛV whereby lineages move rapidly over small distances. Using simulations, we show that the induced ΛV tree-generating process is well approximated by a birth-death model. Our results also indicate that Brownian motions modelling the movements of lines of descent along birth-death trees do not generally provide a good approximation of the ΛV, owing to habitat boundary effects that play an increasingly important role in the long run. Accounting for habitat boundaries through reflected Brownian motions, however, considerably increases the similarity to the ΛV model. Finally, we describe efficient algorithms for fast simulation of the backward and forward in time versions of the ΛV model.
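Editorial illustration (not part of the published abstract): the idea of a Brownian motion reflected at habitat boundaries can be sketched as follows. This is a minimal discretised sketch, assuming a rectangular habitat, a fixed time step, and mirror reflection at the walls; the function names `reflect` and `simulate_rbm` and all parameter values are hypothetical and are not taken from the article, whose own simulation algorithms may differ.

```python
import math
import random


def reflect(x, lower, upper):
    """Fold a coordinate back into [lower, upper] by mirror reflection."""
    width = upper - lower
    # Position within a doubled habitat of length 2 * width (periodic).
    y = (x - lower) % (2.0 * width)
    # The second half of the doubled habitat is the mirror image of the first.
    return lower + (2.0 * width - y if y > width else y)


def simulate_rbm(start, sigma, dt, n_steps, lower, upper, rng):
    """Simulate a 2-D reflected Brownian path with Gaussian increments."""
    sd = sigma * math.sqrt(dt)  # standard deviation of one increment
    path = [tuple(start)]
    for _ in range(n_steps):
        x, y = path[-1]
        path.append((
            reflect(x + rng.gauss(0.0, sd), lower, upper),
            reflect(y + rng.gauss(0.0, sd), lower, upper),
        ))
    return path


# Hypothetical usage: one lineage moving in a 10 x 10 habitat.
rng = random.Random(1)
path = simulate_rbm((5.0, 5.0), sigma=1.0, dt=0.1, n_steps=1000,
                    lower=0.0, upper=10.0, rng=rng)
print(len(path))
```

Folding each coordinate independently keeps every simulated location inside the habitat, which is the property an unconstrained Brownian motion lacks over long time spans.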


Subject(s)
Genetics, Population , Models, Genetic , Stochastic Processes
2.
Proc Natl Acad Sci U S A ; 118(52), 2021 Dec 28.
Article in English | MEDLINE | ID: mdl-34930835

ABSTRACT

Statistical phylogeography provides useful tools to characterize and quantify the spread of organisms during the course of evolution. Analyzing georeferenced genetic data often relies on the assumption that samples are preferentially collected in densely populated areas of the habitat. Deviation from this assumption negatively impacts the inference of the spatial and demographic dynamics. This issue is pervasive in phylogeography. It affects analyses that approximate the habitat as a set of discrete demes as well as those that treat it as a continuum. The present study introduces a Bayesian modeling approach that explicitly accommodates spatial sampling strategies. An original inference technique, based on recent advances in statistical computing, is then described that is best suited to modeling data where sequences are preferentially collected at certain locations, independently of the outcome of the evolutionary process. The analysis of georeferenced genetic sequences from the West Nile virus in North America along with simulated data shows how assumptions about spatial sampling may impact our understanding of the forces shaping biodiversity across time and space.


Subject(s)
Models, Statistical , Phylogeography/methods , Population Dynamics , Algorithms , Bayes Theorem , Ecosystem , Evolution, Molecular , Humans , North America , Spatial Analysis , West Nile Fever/epidemiology , West Nile Fever/virology , West Nile virus/genetics
3.
Theor Popul Biol ; 146: 15-28, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35662574

ABSTRACT

We revisit the Spatial Λ-Fleming-Viot process introduced in Barton and Kelleher (2010). In particular, we are interested in the time T0 to the most recent common ancestor for two lineages. We distinguish between the cases where the process acts on the two-dimensional plane and on a finite rectangle. Utilizing a differential equation linking T0 with the physical distance between the lineages, we arrive at computationally efficient and reasonably accurate approximation schemes for both cases. Furthermore, our analysis enables us to address the question of whether the genealogical process of the model "comes down from infinity", which has been partly answered before in Véber and Wakolbinger (2015).


Subject(s)
Genetics, Population
4.
PLoS Comput Biol ; 17(1): e1008561, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33406072

ABSTRACT

Phylogeographic inference allows reconstruction of past geographical spread of pathogens or living organisms by integrating genetic and geographic data. A popular model in continuous phylogeography (with location data provided in the form of latitude and longitude coordinates) describes spread as a Brownian motion (Brownian Motion Phylogeography, BMP) in continuous space and time, akin to similar models of continuous trait evolution. Here, we show that reconstructions using this model can be strongly affected by sampling biases, such as the lack of sampling from certain areas. As an attempt to reduce the effects of sampling bias on BMP, we consider the addition of sequence-free samples from under-sampled areas. While this approach alleviates the effects of sampling bias, in most scenarios this will not be a viable option due to the need for prior knowledge of an outbreak's spatial distribution. We therefore consider an alternative model, the spatial Λ-Fleming-Viot process (ΛFV), which has recently gained popularity in population genetics. Despite the ΛFV's robustness to sampling biases, we find that the different assumptions of the ΛFV and BMP models result in different applicabilities, with the ΛFV being more appropriate for scenarios of endemic spread, and BMP being more appropriate for recent outbreaks or colonizations.


Subject(s)
Genetics, Population/methods , Models, Genetic , Phylogeography/methods , Selection Bias , Bayes Theorem , Computational Biology , Disease Outbreaks/statistics & numerical data , Flavivirus/genetics , Flavivirus Infections/epidemiology , Flavivirus Infections/virology , Humans , Markov Chains
5.
BMC Bioinformatics ; 22(1): 463, 2021 Sep 27.
Article in English | MEDLINE | ID: mdl-34579644

ABSTRACT

BACKGROUND: Phylogeographic reconstructions serve as a basis to understand the spread and evolution of pathogens. Visualization of these reconstructions often leads to complex graphical representations which are difficult to interpret. RESULT: We present EvoLaps, a user-friendly web interface to visualize phylogeographic reconstructions based on the analysis of latitude/longitude coordinates with various clustering levels. EvoLaps also produces transition diagrams that provide concise and easy to interpret summaries of phylogeographic reconstructions. CONCLUSION: The main contribution of EvoLaps is to assemble known numerical and graphical methods/tools into a user-friendly interface dedicated to the visualization and editing of evolutionary scenarios based on continuous phylogeographic reconstructions. EvoLaps is freely usable at www.evolaps.org.


Subject(s)
Phylogeny , Cluster Analysis , Phylogeography
6.
Syst Biol ; 67(4): 651-661, 2018 Jul 01.
Article in English | MEDLINE | ID: mdl-29385558

ABSTRACT

This study introduces a new Bayesian technique for molecular dating that explicitly accommodates uncertainty in the phylogenetic position of calibrated nodes derived from the analysis of fossil data. The proposed approach thus defines an adequate framework for incorporating expert knowledge and/or prior information about the way fossils were collected in the inference of node ages. Although it belongs to the class of "node-dating" approaches, this method shares interesting properties with "tip-dating" techniques. Yet, it alleviates some of the computational and modeling difficulties that hamper tip-dating approaches. The influence of fossil data on the probabilistic distribution of trees is the crux of the matter considered here. More specifically, among all the phylogenies that a tree model (e.g., the birth-death process) generates, only a fraction "agree" with the fossil data. Bayesian inference under the new model requires taking this fraction into account. However, evaluating this quantity is difficult in practice. A generic solution to this issue is presented here. The proposed approach relies on a recent statistical technique, the so-called exchange algorithm, dedicated to drawing samples from "doubly intractable" distributions. A small example illustrates the problem of interest and the impact of uncertainty in the placement of calibration constraints in the phylogeny given fossil data. An analysis of land plant sequences and multiple fossils further highlights the pertinence of the proposed approach.


Subject(s)
Embryophyta/classification , Evolution, Molecular , Genetic Speciation , Models, Genetic , Bayes Theorem , Calibration , Fossils , Models, Biological
7.
Theor Popul Biol ; 111: 43-50, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27184386

ABSTRACT

Understanding population dynamics from the analysis of molecular and spatial data requires sound statistical modeling. Current approaches assume that populations are naturally partitioned into discrete demes, thereby failing to be relevant in cases where individuals are scattered on a spatial continuum. Other models predict the formation of increasingly tight clusters of individuals in space, which, again, conflicts with biological evidence. Building on recent theoretical work, we introduce a new genealogy-based inference framework that alleviates these issues. This approach effectively implements a stochastic model in which the distribution of individuals is homogeneous and stationary, thereby providing a relevant null model for the fluctuation of genetic diversity in time and space. Importantly, the spatial density of individuals in a population and their range of dispersal during the course of evolution are two parameters that can be inferred separately with this method. The validity of the new inference framework is confirmed with extensive simulations and the analysis of influenza sequences collected over five seasons in the USA.


Subject(s)
Demography , Genealogy and Heraldry , Models, Statistical , Population Dynamics , Genetic Variation , Humans , Models, Genetic
8.
Mol Biol Evol ; 31(2): 484-95, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24132121

ABSTRACT

The branch-site model is a widely popular approach that accommodates the lineage- and the site-specific heterogeneity of natural selection regimes among coding sequences. This model relies on prior knowledge of the (foreground) lineage(s) evolving under positive selection at some sites. Unfortunately, such prior information is not always available in practice. A more recent technique (Guindon S, Rodrigo A, Dyer K, Huelsenbeck J. 2004. Modeling the site-specific variation of selection patterns along lineages. Proc Natl Acad Sci USA 101:12957-12962) alleviates this issue by explicitly modeling the variability of selection patterns using a stochastic process. However, the performance of this approach for deciding whether a set of homologous sequences evolved under positive selection at some point has not been assessed yet. This study compares the sensitivity and specificity of tests for positive selection derived from both the standard and the stochastic approaches using extensive simulations. We show that the two methods have low proportions of type I errors, that is, they tend to be conservative when testing the null hypothesis of no positive selection if sequences truly evolve under neutral or negative selection regimes. Also, the standard approach is more powerful than the stochastic one when the prior knowledge on foreground lineages is correct. When this prior is incorrect, however, the stochastic approach outperforms the standard model in a broad range of conditions. Additional comparisons also suggest that the stochastic branch-site method compares favorably with the recently proposed mixed-effects model of evolution of Murrell et al. (Murrell B, Wertheim JO, Moola S, Weighill T, Scheffler K, Pond SLK. 2012. Detecting individual sites subject to episodic diversifying selection. PLoS Genet. 8:e1002764). Altogether, our results show that the standard branch-site model is well suited to confirmatory analyses, whereas the stochastic approach should be preferred over the standard or the mixed-effects ones for exploratory studies.


Subject(s)
Computer Simulation/standards , Open Reading Frames/genetics , Selection, Genetic , Computational Biology , Evolution, Molecular , Likelihood Functions , Models, Genetic , Sensitivity and Specificity
9.
J Mol Evol ; 80(2): 130-41, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25627928

ABSTRACT

The formyl peptide receptors (FPRs) are a family of chemoattractant receptors with important roles in host defense and the regulation of inflammatory reactions. In humans, three FPR paralogs have been identified (FPR1, FPR2, and FPR3) and may have functionally diversified by gene duplication and adaptive evolution. However, the evolutionary mechanisms operating in the diversification of FPR family genes and the changes in selection pressures have not been characterized to date. Here, we performed a comprehensive evolutionary analysis of FPR genes from mammalian species. Phylogenetic analysis showed that an early duplication was responsible for FPR1 and FPR2/FPR3 splitting, and FPR3 originated from the latest duplication event near the origin of primates. Codon-based tests of positive selection reveal interesting patterns in FPR1 and FPR2 versus FPR3, with the first two genes showing clear evidence of positive selection at some sites while the majority of sites evolve under strong negative selection. In contrast, our results suggest that the selective pressure may be relaxed in the FPR3 lineage. Of the six amino acid sites inferred to evolve under positive selection in FPR1 and FPR2, four sites were located in extracellular loops of the protein. The electrostatic potential of the extracellular surface of FPR might be affected more frequently by amino acid substitutions in positively selected sites. Thus, positive selection of FPRs among mammals may reflect a link between changes in the sequence and surface structure of the proteins and is likely to be important in the host's defense against invading pathogens.


Subject(s)
Evolution, Molecular , Mammals/genetics , Receptors, Formyl Peptide/genetics , Animals , Phylogeny , Protein Structure, Tertiary , Receptors, Formyl Peptide/chemistry
10.
Proc Biol Sci ; 282(1806): 20150420, 2015 May 07.
Article in English | MEDLINE | ID: mdl-25876846

ABSTRACT

One of the central objectives in the field of phylodynamics is the quantification of population dynamic processes using genetic sequence data or in some cases phenotypic data. Phylodynamics has been successfully applied to many different processes, such as the spread of infectious diseases, within-host evolution of a pathogen, macroevolution and even language evolution. Phylodynamic analysis requires a probability distribution on phylogenetic trees spanned by the genetic data. Because such a probability distribution is not available for many common stochastic population dynamic processes, coalescent-based approximations assuming deterministic population size changes are widely employed. Key to many population dynamic models, in particular epidemiological models, is a period of exponential population growth during the initial phase. Here, we show that the coalescent is a poor approximation of stochastic exponential population growth, which is typically modelled by a birth-death process. We demonstrate that introducing demographic stochasticity into the population size function of the coalescent improves the approximation for values of R0 close to 1, but substantial differences remain for large R0. In addition, the computational advantage of using an approximation over exact models vanishes when introducing such demographic stochasticity. These results highlight that we need to increase efforts to develop phylodynamic tools that correctly account for the stochasticity of population dynamic models for inference.


Subject(s)
Birth Rate , Models, Biological , Mortality , Population Dynamics , Population Growth , Stochastic Processes
11.
Syst Biol ; 63(5): 743-52, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24929898

ABSTRACT

Competition between organisms influences the processes governing the colonization of new habitats. As a consequence, species or populations arriving first at a suitable location may prevent secondary colonization. Although adaptation to environmental variables (e.g., temperature, altitude) is essential, the presence or absence of certain species at a particular location often depends on whether or not competing species co-occur. For example, competition is thought to play an important role in structuring mammalian community assembly. It can also explain spatial patterns of low genetic diversity following rapid colonization events or the "progression rule" displayed by phylogenies of species found on archipelagos. Despite the potential of competition to maintain populations in isolation, past quantitative analyses have largely ignored it because of the difficulty in designing adequate methods for assessing its impact. We present here a new model that integrates competition and dispersal into a Bayesian phylogeographic framework. Extensive simulations and analysis of real data show that our approach clearly outperforms the traditional Mantel test for detecting correlation between genetic and geographic distances. Most importantly, we demonstrate that competition can be detected with high sensitivity and specificity from the phylogenetic analysis of genetic variation in space.


Subject(s)
Gryllidae/classification , Models, Biological , Animals , Competitive Behavior/physiology , Ecosystem , Gryllidae/genetics , Phylogeography , Population Dynamics
12.
J Math Biol ; 71(6-7): 1387-409, 2015 Dec.
Article in English | MEDLINE | ID: mdl-25716798

ABSTRACT

Accurate estimation of species divergence times from the analysis of genetic sequences relies on probabilistic models of evolution of the rate of molecular evolution. Importantly, while these models describe the sample paths of the substitution rates along a phylogenetic tree, only the (random) average rate can be estimated on each edge. For mathematical convenience, the stochastic nature of these averages is generally ignored. In this article we derive the probabilistic distribution of the average substitution rate assuming a geometric Brownian motion for the sample paths, and we investigate the corresponding error bounds via numerical simulations. In particular we confirm the validity of the gamma approximation proposed in Guindon (Syst Biol 62(1):22-34, 2013) for "small" values of the autocorrelation parameter.


Subject(s)
Evolution, Molecular , Models, Biological , Models, Genetic , Animals , Bayes Theorem , Computer Simulation , Markov Chains , Mathematical Concepts , Models, Statistical , Monte Carlo Method , Phylogeny , Stochastic Processes , Time Factors
13.
Int J Mol Sci ; 16(12): 28472-85, 2015 Dec 01.
Article in English | MEDLINE | ID: mdl-26633372

ABSTRACT

Ten-eleven translocation (TET) proteins, a family of Fe(2+)- and 2-oxoglutarate-dependent dioxygenases, are involved in DNA demethylation. They also help regulate various cellular functions. Three TET paralogs have been identified (TET1, TET2, and TET3) in humans. This study focuses on the evolution of mammalian TET genes. Distinct patterns in TET1 and TET2 vs. TET3 were revealed by codon-based tests of positive selection. Results indicate that the TET1 and TET2 genes have experienced positive selection more frequently than the TET3 gene, and that the majority of codon sites evolved under strong negative selection. These findings imply that the selective pressure on TET3 may have been relaxed in several lineages during the course of evolution. Our analysis of convergent amino acid substitutions also supports the different evolutionary dynamics among TET gene subfamily members. All five amino acid sites that are inferred to have evolved under positive selection in the catalytic domain of TET2 are localized at the protein's outer surface. The adaptive changes of these positively selected amino acid sites could be associated with dynamic interactions between other TET-interacting proteins, and positive selection thus appears to shift the regulatory scheme of TET enzyme function.


Subject(s)
Dioxygenases/genetics , Evolution, Molecular , Mammals/genetics , Multigene Family , Animals , Catalytic Domain , Codon , Dioxygenases/chemistry , Dioxygenases/metabolism , Genetic Variation , Humans , Mammals/metabolism , Models, Molecular , Protein Conformation , Protein Interaction Domains and Motifs , Selection, Genetic
14.
Syst Biol ; 62(1): 22-34, 2013 Jan 01.
Article in English | MEDLINE | ID: mdl-22798331

ABSTRACT

The accuracy and precision of species divergence date estimation from molecular data strongly depend on the models describing the variation of substitution rates along a phylogeny. These models generally assume that rates randomly fluctuate along branches from one node to the next. However, for mathematical convenience, the stochasticity of such a process is ignored when translating these rate trajectories into branch lengths. This study addresses this shortcoming. A new approach is described that explicitly considers the average substitution rates along branches as random quantities, resulting in a more realistic description of the variations of evolutionary rates along lineages. The proposed method provides more precise estimates of the rate autocorrelation parameter as well as divergence times. Also, simulation results indicate that ignoring the stochastic variation of rates along edges can lead to significant overestimation of specific node ages. Altogether, the new approach introduced in this study is a step forward to designing biologically relevant models of rate evolution that are well suited to data sets with dense taxon sampling which are likely to present rate autocorrelation. The computer programme PhyTime, part of the PhyML package and implementing the new approach, is available from http://code.google.com/p/phyml (last accessed 1 August 2012).


Subject(s)
Classification/methods , Genetic Heterogeneity , Models, Genetic , Animals , Butterflies/classification , Butterflies/genetics , Cebidae/classification , Cebidae/genetics , Computer Simulation , Genetic Speciation , HIV-1/classification , HIV-1/genetics , Reproducibility of Results , Rodentia/classification , Rodentia/genetics , Software
15.
Gut ; 62(9): 1347-55, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23242209

ABSTRACT

OBJECTIVE: To examine viral evolutionary changes and their relationship to hepatitis B e antigen (HBeAg) seroconversion. DESIGN: A matched case-control study of HBeAg seroconverters (n = 8) and non-seroconverters (n = 7) with adequate stored sera before seroconversion was performed. Nested PCR, cloning and sequencing of hepatitis B virus (HBV) precore/core gene was performed. Sequences were aligned using Clustal X2.0, followed by construction of phylogenetic trees using Pebble 1.0. Viral diversity, evolutionary rates and positive selection were then analysed. RESULTS: Baseline HBV quasispecies viral diversity was identical in seroconverters and non-seroconverters 10 years before seroconversion but started to increase approximately 3 years later. Concurrently, precore stop codon (PSC) mutations appeared. Some 2 years later, HBV-DNA declined, together with a dramatic reduction in HBeAg titres. Just before HBeAg seroconversion, seroconverters had HBV-DNA levels 2 log lower (p = 0.008), HBeAg titres 310-fold smaller (p = 0.02), PSC mutations > 25% (p < 0.001), viral evolution 8.1-fold higher (p = 0.01) and viral diversity 2.9-fold higher (p < 0.001), compared to non-seroconverters, with a 9.3-fold higher viral diversity than baseline (p = 0.011). Phylogenetic trees in seroconverters showed clustering of separate time points and longer branch lengths than non-seroconverters (p = 0.01). Positive selection was detected in five of eight seroconverters but none in non-seroconverters (p = 0.026). There was significant negative correlation between viral diversity (rs = -0.60, p < 0.001) and HBV-DNA or HBeAg (rs = -0.58, p = 0.006) levels; and positive correlation with PSC mutations (rs = 0.38, p = 0.009). Over time, the significant positive correlation was viral diversity (rs = 0.65, p < 0.001), while negative correlation was HBV-DNA (rs = -0.627, p < 0.001) and HBeAg levels (rs = -0.512, p = 0.015). CONCLUSIONS: Cumulative viral evolutionary changes that precede HBeAg seroconversion provide insights into this event that may have implications for therapy.


Subject(s)
DNA, Viral/analysis , Hepatitis B e Antigens/blood , Hepatitis B virus , Hepatitis B, Chronic , Adult , Base Sequence , Biological Evolution , Case-Control Studies , Codon, Terminator , Female , Hepatitis B virus/genetics , Hepatitis B virus/immunology , Hepatitis B, Chronic/immunology , Hepatitis B, Chronic/virology , Humans , Male , Mutation , Phylogeny , Protein Biosynthesis , Time Factors
16.
bioRxiv ; 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38895258

ABSTRACT

Accurate estimation of the dispersal velocity or speed of evolving organisms is no mean feat. In fact, existing probabilistic models in phylogeography or spatial population genetics generally do not provide an adequate framework to define velocity in a relevant manner. For instance, the very concept of instantaneous speed simply does not exist under one of the most popular approaches that models the evolution of spatial coordinates as Brownian trajectories running along a phylogeny [30]. Here, we introduce a new family of models, the so-called "Phylogenetic Integrated Velocity" (PIV) models, that use Gaussian processes to explicitly model the velocity of evolving lineages instead of focusing on the fluctuation of spatial coordinates over time. We describe the properties of these models and show an increased accuracy of velocity estimates compared to previous approaches. Analyses of West Nile virus data in the U.S.A. indicate that PIV models provide sensible predictions of the dispersal of evolving pathogens at a one-year time horizon. These results demonstrate the feasibility and relevance of predictive phylogeography in monitoring epidemics in time and space.

17.
bioRxiv ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38645268

ABSTRACT

Genomic data collected from viral outbreaks can be exploited to reconstruct the dispersal history of viral lineages in a two-dimensional space using continuous phylogeographic inference. These spatially explicit reconstructions can subsequently be used to estimate dispersal metrics that reveal the dispersal dynamics and help evaluate the capacity to spread among hosts. Heterogeneous sampling intensity of genomic sequences can however impact the accuracy of dispersal insights gained through phylogeographic inference. In our study, we implement a simulation framework to evaluate the robustness of three dispersal metrics (a lineage dispersal velocity, a diffusion coefficient, and an isolation-by-distance signal metric) to the sampling effort. Our results reveal that both the diffusion coefficient and isolation-by-distance signal metrics appear to be robust to the number of samples considered for the phylogeographic reconstruction. We then use these two dispersal metrics to compare the dispersal pattern and capacity of various viruses spreading in animal populations. Our comparative analysis reveals a broad range of isolation-by-distance patterns and diffusion coefficients mostly reflecting the dispersal capacity of the main infected host species but also, in some cases, the likely signature of rapid and/or long-distance dispersal events driven by human-mediated movements through animal trade. Overall, our study provides key recommendations for the lineage dispersal metrics to consider in future studies and illustrates their application to compare the spread of viruses in various settings.

18.
Mol Biol Evol ; 29(6): 1695-701, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22319168

ABSTRACT

In phylogenetic analyses of molecular sequence data, partitioning involves estimating independent models of molecular evolution for different sets of sites in a sequence alignment. Choosing an appropriate partitioning scheme is an important step in most analyses because it can affect the accuracy of phylogenetic reconstruction. Despite this, partitioning schemes are often chosen without explicit statistical justification. Here, we describe two new objective methods for the combined selection of best-fit partitioning schemes and nucleotide substitution models. These methods allow millions of partitioning schemes to be compared in realistic time frames and so permit the objective selection of partitioning schemes even for large multilocus DNA data sets. We demonstrate that these methods significantly outperform previous approaches, including both the ad hoc selection of partitioning schemes (e.g., partitioning by gene or codon position) and a recently proposed hierarchical clustering method. We have implemented these methods in an open-source program, PartitionFinder. This program allows users to select partitioning schemes and substitution models using a range of information-theoretic metrics (e.g., the Bayesian information criterion [BIC], Akaike information criterion [AIC], and corrected AIC). We hope that PartitionFinder will encourage the objective selection of partitioning schemes and thus lead to improvements in phylogenetic analyses. PartitionFinder is written in Python and runs under Mac OSX 10.4 and above. The program, source code, and a detailed manual are freely available from www.robertlanfear.com/partitionfinder.
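Editorial illustration (not part of the published abstract): the information-theoretic metrics named above have standard textbook definitions, sketched below. This is not PartitionFinder's code or API; the function names, the two candidate schemes, and their log-likelihoods and parameter counts are hypothetical, and treating the alignment length as the sample size `n` is an assumption.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * k - 2.0 * log_lik


def aicc(log_lik, k, n):
    """AIC with the small-sample correction 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2.0 * log_lik


# Hypothetical candidate schemes: (log-likelihood, number of free parameters).
n_sites = 3000  # assumed sample size: number of alignment columns
schemes = {"by_gene": (-15250.0, 40), "by_codon_position": (-15100.0, 95)}

# Lower scores indicate a better trade-off between fit and complexity.
scores = {name: bic(ll, k, n_sites) for name, (ll, k) in schemes.items()}
best = min(scores, key=scores.get)
print(best)
```

All three criteria penalise the maximised log-likelihood by the number of free parameters; they differ only in how heavily extra parameters are penalised, which is why the choice of metric can change which scheme is selected.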


Subject(s)
Models, Genetic , Phylogeny , Software , Algorithms , Bayes Theorem , Cluster Analysis , Evolution, Molecular , Likelihood Functions , Selection, Genetic , Sequence Analysis, DNA/methods
19.
Mol Biol Evol ; 29(11): 3345-58, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22617951

ABSTRACT

Molecular evolutionary rate estimates have been shown to depend on the time period over which they are estimated. Factors such as demographic processes, calibration errors, purifying selection, and the heterogeneity of substitution rates among sites (RHAS) are known to affect the accuracy with which rates of evolution are estimated. We use mathematical modeling and Bayesian analyses of simulated sequence alignments to explore how mutational hotspots can lead to time-dependent rate estimates. Mathematical modeling shows that underestimation of molecular rates over increasing time scales is inevitable when RHAS is ignored. Although a gamma distribution is commonly used to model RHAS, we show that when the actual RHAS deviates from a gamma-like distribution, rates can either be under- or overestimated in a time-dependent manner. Simulations performed under different scenarios of RHAS confirm the mathematical modeling and demonstrate the impacts of time-dependent rates on estimates of divergence times. Most notably, erroneous rate estimates can have narrow credibility intervals, leading to false confidence in biased estimates of rates and node ages. Surprisingly, large errors in estimates of overall molecular rate do not necessarily generate large errors in divergence time estimates. Finally, we illustrate the correlation between time-dependent rate patterns and differential saturation between quickly and slowly evolving sites. Our results suggest that data partitioning or simple nonparametric mixture models of RHAS significantly improve the accuracy with which node ages and substitution rates can be estimated.


Subject(s)
Evolution, Molecular , Mutation/genetics , Base Sequence , Computer Simulation , Genetic Variation , Models, Genetic , Time Factors
20.
J Hepatol ; 58(2): 217-24, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23023011

ABSTRACT

BACKGROUND & AIMS: Increased viral diversity and evolution appear to be a pre-HBeAg-seroconversion feature in spontaneous and interferon-treated seroconverters. The aim of this study was to examine the viral evolution pattern in nucleoside analogue related HBeAg-seroconversion. METHODS: This was a case control study consisting of ten lamivudine-treated HBeAg-seroconverters and ten lamivudine-treated non-seroconverters as matching controls. All patients in this study were followed for as long as 6 years after starting lamivudine, and cases had three serum time points before HBeAg-seroconversion while controls had three matching serum time points. Nested PCR, cloning and sequencing of HBV precore/core gene were performed. Sequences were aligned with Clustal X 2.0. Phylogenetic trees were constructed and viral diversity, evolutionary rates and patterns of positive selection were evaluated. RESULTS: After starting lamivudine treatment, HBV viral diversity increased in both seroconverters and non-seroconverters, but seroconverters showed a significantly higher level of viral diversity that persisted over time by 2.1-fold (p = 0.009). The increased viral diversity correlated with reduced HBV DNA levels (p < 0.001). Lamivudine-treated seroconverters had significantly reduced HBV DNA concurrent with increased viral diversity after starting treatment (p = 0.001, compared to non-seroconverters), a pattern resembling that of the interferon-treated seroconverters published previously. There was evidence of positive selection in seroconverters with significantly increased amino acid changes compared to non-seroconverters (p < 0.001), occurring in recognized T-cell and B-cell epitopes. CONCLUSIONS: Lamivudine-treated HBeAg-seroconverters showed a higher viral diversity than non-seroconverters, and the pattern resembled that of interferon-treated seroconverters. The findings strengthen the evidence that increased viral diversity is strongly associated with HBeAg-seroconversion.


Subject(s)
Evolution, Molecular , Hepatitis B e Antigens/blood , Hepatitis B virus/genetics , Hepatitis B virus/immunology , Lamivudine/therapeutic use , Nucleosides/therapeutic use , Administration, Oral , Adult , Antigens, Viral/blood , Case-Control Studies , DNA, Viral/genetics , Female , Humans , Lamivudine/administration & dosage , Male , Nucleosides/administration & dosage , Phylogeny , Reverse Transcriptase Inhibitors/administration & dosage , Reverse Transcriptase Inhibitors/therapeutic use , Serology