ABSTRACT
As genomic sequence data become increasingly available, it can be enticing to infer the species phylogeny directly from concatenated genomic data. However, this approach yields a biased estimator of branch lengths and substitution rates and an inconsistent estimator of tree topology. Bayesian multispecies coalescent (MSC) methods address these issues by constraining a set of gene trees within a species tree and jointly inferring both under a Bayesian framework. This, however, comes at the cost of increased computational demand. Here, we introduce StarBeast3, a software package for efficient Bayesian inference under the MSC model via Markov chain Monte Carlo. We gain efficiency by introducing cutting-edge proposal kernels and adaptive operators, and StarBeast3 is particularly efficient when a relaxed clock model is applied. Furthermore, gene-tree inference is parallelized, allowing the software to scale with the size of the problem. We validated our software and benchmarked its performance using three real and two synthetic data sets. Our results indicate that StarBeast3 is up to one-and-a-half orders of magnitude faster than StarBeast2, and therefore more than two orders of magnitude faster than *BEAST, depending on the data set and on the parameter, and can achieve convergence on large data sets with hundreds of genes. StarBeast3 is open-source and is easy to set up with a friendly graphical user interface. [Adaptive; Bayesian inference; BEAST 2; effective population sizes; high performance; multispecies coalescent; parallelization; phylogenetics.]
Subject(s)
Models, Genetic; Software; Bayes Theorem; Markov Chains; Monte Carlo Method; Phylogeny
ABSTRACT
New Zealand, Australia, Iceland, and Taiwan all saw success in controlling their first waves of Coronavirus Disease 2019 (COVID-19). As islands, they make excellent case studies for exploring the effects of international travel and human movement on the spread of COVID-19. We employed a range of robust phylodynamic methods and genome subsampling strategies to infer the epidemiological history of severe acute respiratory syndrome coronavirus 2 in these four countries. We compared these results to transmission clusters identified by the New Zealand Ministry of Health through contact tracing. We estimated the effective reproduction number of COVID-19 as 1 to 1.4 during the early stages of the pandemic and show that it declined below 1 as human movement was restricted. We also showed that the disease was introduced many times into each country and that introductions slowed markedly following the reduction of international travel in mid-March 2020. Finally, we confirmed that New Zealand transmission clusters identified via standard health surveillance strategies largely agree with those defined by genomic data. We have demonstrated how genomic data and computational biology methods can assist health officials in characterising the epidemiology of viral epidemics and in contact tracing.
ABSTRACT
Dengue is a prevalent disease in Colombia, and all dengue virus serotypes (DENV-1 to -4) have co-circulated in the country since 2001. However, the relative impact of gene flow and local diversification on epidemic dynamics is unknown due to heterogeneous sampling and a lack of sufficient genetic data. The region of Santander is one of the areas with the highest incidence of dengue in Colombia. To provide a better understanding of the epidemiology of dengue, we inferred DENV population dynamics using samples collected between 1998 and 2015. We used Bayesian phylogenetic analysis and included 143 new envelope gene sequences from Colombia, mainly from the region of Santander, and 235 published sequences from representative countries in the Americas. We documented a single genotype for each serotype but multiple introductions. Whereas the majority of DENV-1, DENV-2, and DENV-4 strains fell into a single lineage, DENV-3 strains fell into two distinct lineages that co-circulated. The inferred times to the most recent common ancestors for the most recent clades of DENV-1, DENV-2, and DENV-4 fell between 1977 and 1987, and that for DENV-3 was around 1995. Demographic reconstructions suggested a gradual increase in viral diversity over time. A phylogeographical analysis underscored that Colombia mainly receives viral lineages and identified a significant diffusion route between Colombia and Venezuela. Our findings contribute to a better understanding of viral diversity and dengue epidemiology in Colombia.