ABSTRACT
The emergence of the COVID-19 epidemic in the United States (U.S.) went largely undetected due to inadequate testing. New Orleans experienced one of the earliest and fastest accelerating outbreaks, coinciding with Mardi Gras. To gain insight into the emergence of SARS-CoV-2 in the U.S. and how large-scale events accelerate transmission, we sequenced SARS-CoV-2 genomes during the first wave of the COVID-19 epidemic in Louisiana. We show that SARS-CoV-2 in Louisiana had limited diversity compared to other U.S. states and that one introduction of SARS-CoV-2 led to almost all of the early transmission in Louisiana. By analyzing mobility and genomic data, we show that SARS-CoV-2 was already present in New Orleans before Mardi Gras, and the festival dramatically accelerated transmission. Our study provides an understanding of how superspreading during large-scale events played a key role during the early outbreak in the U.S. and can greatly accelerate epidemics.
Subject(s)
COVID-19/epidemiology , Epidemics , SARS-CoV-2/physiology , COVID-19/transmission , Databases as Topic , Disease Outbreaks , Humans , Louisiana/epidemiology , Phylogeny , Risk Factors , SARS-CoV-2/classification , Texas , Travel , United States/epidemiology
ABSTRACT
The 2013-2015 Ebola virus disease (EVD) epidemic is caused by the Makona variant of Ebola virus (EBOV). Early in the epidemic, genome sequencing provided insights into virus evolution and transmission and offered important information for outbreak response. Here, we analyze sequences from 232 patients sampled over 7 months in Sierra Leone, along with 86 previously released genomes from earlier in the epidemic. We confirm sustained human-to-human transmission within Sierra Leone and find no evidence for import or export of EBOV across national borders after its initial introduction. Using high-depth replicate sequencing, we observe both host-to-host transmission and recurrent emergence of intrahost genetic variants. We trace the increasing impact of purifying selection in suppressing the accumulation of nonsynonymous mutations over time. Finally, we note changes in the mucin-like domain of EBOV glycoprotein that merit further investigation. These findings clarify the movement of EBOV within the region and describe viral evolution during prolonged human-to-human transmission.
Subject(s)
Ebolavirus/genetics , Ebolavirus/isolation & purification , Genome, Viral , Hemorrhagic Fever, Ebola/epidemiology , Hemorrhagic Fever, Ebola/virology , Mutation , Biological Evolution , Disease Outbreaks , Ebolavirus/classification , Hemorrhagic Fever, Ebola/transmission , Humans , Sierra Leone/epidemiology , Specimen Handling
ABSTRACT
Influenza A viruses (IAV) have caused more documented global pandemics in human history than any other pathogen [1,2]. High pathogenicity avian influenza (HPAI) viruses belonging to the H5N1 subtype are a leading pandemic risk. Two decades after H5N1 "bird flu" became established in poultry in Southeast Asia, its descendants have resurged [3], setting off an H5N1 panzootic in wild birds that is fueled by (a) rapid intercontinental spread, reaching South America and Antarctica for the first time [4,5]; (b) fast evolution via genomic reassortment [6]; and (c) frequent spillover into terrestrial [7,8] and marine mammals [9]. The virus has sustained mammal-to-mammal transmission in multiple settings, including European fur farms [10,11], South American marine mammals [12-15], and US dairy cattle [16-19], raising questions about whether humans are next. Historically, swine are considered optimal intermediary hosts that help avian influenza viruses (AIV) adapt to mammals before jumping to humans [20]. However, the altered ecology of H5N1 has opened the door to new evolutionary pathways. Could dairy cattle, farmed mink, or South American sea lions serve as new mammalian gateways to humans? Here we explore the molecular and ecological factors driving H5N1's sudden expansion in host range and assess the likelihood of different zoonotic pathways leading to an H5N1 pandemic.
ABSTRACT
BACKGROUND: Genome streamlining, the process by which genomes become smaller and encode fewer genes over time, is a common phenomenon among pathogenic bacteria. This reduction is driven by selection for minimized energy expenditure in a nutrient-rich environment. As pathogens evolve to become more reliant on the host, metabolic genes and resulting capabilities are lost in favor of siphoning metabolites from the host. Characterizing genome streamlining, gene loss, and metabolic pathway degradation can be useful in assessing pathogen dependency on host metabolism and identifying potential targets for host-directed therapeutics. RESULTS: PoMeLo (Predictor of Metabolic Loss) is a novel evolutionary genomics-guided computational approach for identifying metabolic gaps in the genomes of pathogenic bacteria. PoMeLo leverages a centralized public database of high-quality genomes and annotations and allows the user to compare an unlimited number of genomes across individual genes and pathways. PoMeLo runs locally using user-friendly prompts in a matter of minutes and generates tabular and visual outputs for users to compare predicted metabolic capacity between groups of bacteria and individual species. Each pathway is assigned a Predicted Metabolic Loss (PML) score to assess the magnitude of genome streamlining. Optionally, PoMeLo places the results in an evolutionary context by including phylogenetic relationships in visual outputs. It can also initially compute phylogenetically-weighted mean genome sizes to identify genome streamlining events. Here, we describe PoMeLo and demonstrate its use in identifying metabolic gaps in genomes of pathogenic Treponema species. CONCLUSIONS: PoMeLo represents an advance over existing methods for identifying metabolic gaps in genomic data, allowing comparison across large numbers of genomes and placing the resulting data in a phylogenetic context. 
PoMeLo is freely available for academic and non-academic use at https://github.com/czbiohub-sf/pomelo .
Subject(s)
Genome , Genomics , Phylogeny , Genomics/methods , Biological Evolution , Bacteria/genetics , Software
ABSTRACT
IMPORTANCE: The number of known virus species has increased dramatically through metagenomic studies, which search genetic material sampled from a host for non-host genes. Here, we focus on an important viral family that includes influenza viruses, the Orthomyxoviridae, with over 100 recently discovered viruses infecting hosts from humans to fish. We find that one virus called Wǔhàn mosquito virus 6, discovered in mosquitoes in China, has spread across the globe very recently. Surface proteins used to enter cells show signs of rapid evolution in Wǔhàn mosquito virus 6 and its relatives which suggests an ability to infect vertebrate animals. We compute the rate at which new orthomyxovirus species discovered add evolutionary history to the tree of life, predict that many viruses remain to be discovered, and discuss what appropriately designed future studies can teach us about how diseases cross between continents and species.
Subject(s)
Genome, Viral , Orthomyxoviridae , Evolution, Molecular , Orthomyxoviridae/genetics , Phylogeny , Metagenomics
ABSTRACT
We investigate the emergence, mutation profile, and dissemination of SARS-CoV-2 lineage B.1.214.2, first identified in Belgium in January 2021. This variant, featuring a 3-amino acid insertion in the spike protein similar to the Omicron variant, was speculated to enhance transmissibility or immune evasion. Initially detected in international travelers, it spread substantially in Central Africa, Belgium, Switzerland, and France, peaking in April 2021. Our travel-aware phylogeographic analysis, incorporating travel history, placed the origin in the Republic of the Congo, with primary European entry through France and Belgium, and multiple smaller introductions during the epidemic. We correlate its spread with human travel patterns and air passenger data. Further, upon reviewing national reports of SARS-CoV-2 outbreaks in Belgian nursing homes, we found this strain caused moderately severe outcomes (8.7% case fatality ratio). A distinct nasopharyngeal immune response was observed in elderly patients, characterized by 80% unique signatures, higher B- and T-cell activation, increased type I IFN signaling, and reduced NK, Th17, and complement system activation, compared to similar outbreaks. This unique immune response may explain the variant's epidemiological behavior and underscores the need for nasal vaccine strategies against emerging variants.
Subject(s)
COVID-19 , SARS-CoV-2 , Spike Glycoprotein, Coronavirus , Humans , SARS-CoV-2/genetics , SARS-CoV-2/immunology , COVID-19/immunology , COVID-19/virology , COVID-19/epidemiology , Spike Glycoprotein, Coronavirus/genetics , Spike Glycoprotein, Coronavirus/immunology , Aged , Male , Travel , Belgium/epidemiology , Middle Aged , Female , Adult , Phylogeography , Nasopharynx/virology
ABSTRACT
The 2013-2016 West African epidemic caused by the Ebola virus was of unprecedented magnitude, duration and impact. Here we reconstruct the dispersal, proliferation and decline of Ebola virus throughout the region by analysing 1,610 Ebola virus genomes, which represent over 5% of the known cases. We test the association of geography, climate and demography with viral movement among administrative regions, inferring a classic 'gravity' model, with intense dispersal between larger and closer populations. Despite attenuation of international dispersal after border closures, cross-border transmission had already sown the seeds for an international epidemic, rendering these measures ineffective at curbing the epidemic. We address why the epidemic did not spread into neighbouring countries, showing that these countries were susceptible to substantial outbreaks but at lower risk of introductions. Finally, we reveal that this large epidemic was a heterogeneous and spatially dissociated collection of transmission clusters of varying size, duration and connectivity. These insights will help to inform interventions in future epidemics.
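The 'gravity' model mentioned above can be illustrated with its generic textbook form. This is only a sketch: the symbols (dispersal rate \lambda_{ij}, population sizes P_i and P_j, distance d_{ij}, exponents \alpha, \beta, \gamma, intercept c) are notation assumed here, and the abstract does not give the study's actual predictors or parametrization.

```latex
% Assumed generic gravity form: dispersal from region i to region j
% increases with the two population sizes and decreases with distance.
\lambda_{ij} \propto \frac{P_i^{\alpha} \, P_j^{\beta}}{d_{ij}^{\gamma}},
\qquad
\log \lambda_{ij} = c + \alpha \log P_i + \beta \log P_j - \gamma \log d_{ij}
```

On the log scale the relationship is linear in the predictors, which is why gravity models are commonly fitted as (generalized) linear models.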
Subject(s)
Ebolavirus/genetics , Ebolavirus/physiology , Genome, Viral/genetics , Hemorrhagic Fever, Ebola/transmission , Hemorrhagic Fever, Ebola/virology , Climate , Disease Outbreaks/statistics & numerical data , Ebolavirus/isolation & purification , Geography , Hemorrhagic Fever, Ebola/epidemiology , Humans , Internationality , Linear Models , Molecular Epidemiology , Phylogeny , Travel/legislation & jurisprudence , Travel/statistics & numerical data
ABSTRACT
Zika virus (ZIKV) is causing an unprecedented epidemic linked to severe congenital abnormalities. In July 2016, mosquito-borne ZIKV transmission was reported in the continental United States; since then, hundreds of locally acquired infections have been reported in Florida. To gain insights into the timing, source, and likely route(s) of ZIKV introduction, we tracked the virus from its first detection in Florida by sequencing ZIKV genomes from infected patients and Aedes aegypti mosquitoes. We show that at least 4 introductions, but potentially as many as 40, contributed to the outbreak in Florida and that local transmission is likely to have started in the spring of 2016, several months before its initial detection. By analysing surveillance and genetic data, we show that ZIKV moved among transmission zones in Miami. Our analyses show that most introductions were linked to the Caribbean, a finding corroborated by the high incidence rates and traffic volumes from the region into the Miami area. Our study provides an understanding of how ZIKV initiates transmission in new regions.
Subject(s)
Zika Virus Infection/epidemiology , Zika Virus Infection/virology , Zika Virus/genetics , Aedes/virology , Animals , Caribbean Region/epidemiology , Disease Outbreaks/statistics & numerical data , Female , Florida/epidemiology , Genome, Viral/genetics , Humans , Incidence , Molecular Epidemiology , Mosquito Vectors/virology , Zika Virus/isolation & purification , Zika Virus Infection/transmission
ABSTRACT
Reassortment is an important source of genetic diversity in segmented viruses and is the main source of novel pathogenic influenza viruses. Despite this, studying the reassortment process has been constrained by the lack of a coherent, model-based inference framework. Here, we introduce a coalescent-based model that allows us to explicitly model the joint coalescent and reassortment process. In order to perform inference under this model, we present an efficient Markov chain Monte Carlo algorithm to sample rooted networks and the embedding of phylogenetic trees within networks. This algorithm provides the means to jointly infer coalescent and reassortment rates with the reassortment network and the embedding of segments in that network from full-genome sequence data. Studying reassortment patterns of different human influenza datasets, we find large differences in reassortment rates across different human influenza viruses. Additionally, we find that reassortment events predominantly occur on selectively fitter parts of reassortment networks showing that on a population level, reassortment positively contributes to the fitness of human influenza viruses.
Subject(s)
Influenza, Human/virology , Models, Genetic , Orthomyxoviridae/genetics , Reassortant Viruses/genetics , Algorithms , Evolution, Molecular , Genome, Viral/genetics , Humans , Models, Statistical , Phylogeny
ABSTRACT
The 2013-2016 epidemic of Ebola virus disease in West Africa was of unprecedented magnitude and changed our perspective on this lethal but sporadically emerging virus. This outbreak also marked the beginning of large-scale real-time molecular epidemiology. Here, we show how evolutionary analyses of Ebola virus genome sequences provided key insights into virus origins, evolution and spread during the epidemic. We provide basic scientists, epidemiologists, medical practitioners and other outbreak responders with an enhanced understanding of the utility and limitations of pathogen genomic sequencing. This will be crucially important in our attempts to track and control future infectious disease outbreaks.
Subject(s)
Ebolavirus/genetics , Evolution, Molecular , Hemorrhagic Fever, Ebola/epidemiology , Hemorrhagic Fever, Ebola/virology , Animals , Ebolavirus/classification , Genome, Viral/genetics , Humans , Molecular Epidemiology , Phenotype , Public Health
ABSTRACT
The Ebola virus disease epidemic in West Africa is the largest on record, responsible for over 28,599 cases and more than 11,299 deaths. Genome sequencing in viral outbreaks is desirable to characterize the infectious agent and determine its evolutionary rate. Genome sequencing also allows the identification of signatures of host adaptation, identification and monitoring of diagnostic targets, and characterization of responses to vaccines and treatments. The Ebola virus (EBOV) genome substitution rate in the Makona strain has been estimated at between 0.87 × 10⁻³ and 1.42 × 10⁻³ mutations per site per year. This is equivalent to 16-27 mutations in each genome per year, meaning that sequences diverge rapidly enough to identify distinct sub-lineages during a prolonged epidemic. Genome sequencing provides a high-resolution view of pathogen evolution and is increasingly sought after for outbreak surveillance. Sequence data may be used to guide control measures, but only if the results are generated quickly enough to inform interventions. Genomic surveillance during the epidemic has been sporadic owing to a lack of local sequencing capacity coupled with practical difficulties transporting samples to remote sequencing facilities. To address this problem, here we devise a genomic surveillance system that utilizes a novel nanopore DNA sequencing instrument. In April 2015 this system was transported in standard airline luggage to Guinea and used for real-time genomic surveillance of the ongoing epidemic. We present sequence data and analysis of 142 EBOV samples collected during the period March to October 2015. We were able to generate results less than 24 h after receiving an Ebola-positive sample, with the sequencing process taking as little as 15-60 min. We show that real-time genomic surveillance is possible in resource-limited settings and can be established rapidly to monitor outbreaks.
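The conversion from a per-site substitution rate to a per-genome count quoted above can be checked with a few lines of arithmetic. The sketch below assumes a genome length of 18,959 nucleotides, an approximate figure for EBOV that the abstract itself does not state; the function name is hypothetical.

```python
# Assumption: EBOV genome length of about 18,959 nt (not given in the abstract).
GENOME_LENGTH_NT = 18_959


def substitutions_per_genome_per_year(rate_per_site_per_year: float,
                                      genome_length: int = GENOME_LENGTH_NT) -> float:
    """Expected substitutions accumulated across the whole genome in one year."""
    return rate_per_site_per_year * genome_length


# The two bounds of the substitution rate quoted in the abstract.
low = substitutions_per_genome_per_year(0.87e-3)
high = substitutions_per_genome_per_year(1.42e-3)
print(f"{low:.1f} to {high:.1f} substitutions per genome per year")
```

Under this assumed length the bounds come out at roughly 16.5 and 26.9, consistent with the 16-27 range quoted in the abstract.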
Subject(s)
Ebolavirus/genetics , Epidemiological Monitoring , Genome, Viral/genetics , Hemorrhagic Fever, Ebola/epidemiology , Hemorrhagic Fever, Ebola/virology , Sequence Analysis, DNA/instrumentation , Sequence Analysis, DNA/methods , Aircraft , Disease Outbreaks/statistics & numerical data , Ebolavirus/classification , Ebolavirus/pathogenicity , Guinea/epidemiology , Humans , Mutagenesis/genetics , Mutation Rate , Time Factors
ABSTRACT
Coalescent theory combined with statistical modeling allows us to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. When sequences are sampled serially through time and the distribution of the sampling times depends on the effective population size, explicit statistical modeling of sampling times improves population size estimation. Previous work assumed that the genealogy relating sampled sequences is known and modeled sampling times as an inhomogeneous Poisson process with log-intensity equal to a linear function of the log-transformed effective population size. We improve this approach in two ways. First, we extend the method to allow for joint Bayesian estimation of the genealogy, effective population size trajectory, and other model parameters. Next, we improve the sampling time model by incorporating additional sources of information in the form of time-varying covariates. We validate our new modeling framework using a simulation study and apply our new methodology to analyses of population dynamics of seasonal influenza and to the recent Ebola virus outbreak in West Africa.
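The sampling-time model described above can be written compactly. The first line restates the log-linear relationship named in the abstract; the second is only an assumed form for how time-varying covariates might enter, since the abstract does not specify it. The symbols (\lambda(t) for sampling intensity, N_e(t) for effective population size, \beta_0, \beta_1, \delta_j, x_j(t)) are notation introduced here for illustration.

```latex
% Sampling times follow an inhomogeneous Poisson process with intensity \lambda(t),
% whose log is linear in the log effective population size:
\log \lambda(t) = \beta_0 + \beta_1 \log N_e(t)

% Assumed extension with p time-varying covariates x_1(t), \dots, x_p(t):
\log \lambda(t) = \beta_0 + \beta_1 \log N_e(t) + \sum_{j=1}^{p} \delta_j \, x_j(t)
```

With \beta_1 = 0 the sampling intensity no longer depends on N_e(t), so sampling times carry no information about population size; a positive \beta_1 means samples are collected more densely when the population is larger.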
Subject(s)
Genetics, Population/methods , Models, Statistical , Population Density , Bayes Theorem , Computational Biology , Ebolavirus/genetics , Genome, Viral/genetics , Hemorrhagic Fever, Ebola/epidemiology , Hemorrhagic Fever, Ebola/virology , Humans , Influenza, Human/epidemiology , Influenza, Human/virology , Orthomyxoviridae/genetics , Population Dynamics
ABSTRACT
Middle East respiratory syndrome coronavirus (MERS-CoV) causes a zoonotic respiratory disease of global public health concern, and dromedary camels are the only proven source of zoonotic infection. Although MERS-CoV infection is ubiquitous in dromedaries across Africa as well as in the Arabian Peninsula, zoonotic disease appears confined to the Arabian Peninsula. MERS-CoVs from Africa have hitherto been poorly studied. We genetically and phenotypically characterized MERS-CoV from dromedaries sampled in Morocco, Burkina Faso, Nigeria, and Ethiopia. Viruses from Africa (clade C) are phylogenetically distinct from contemporary viruses from the Arabian Peninsula (clades A and B) but remain antigenically similar in microneutralization tests. Viruses from West (Nigeria, Burkina Faso) and North (Morocco) Africa form a subclade, C1, that shares clade-defining genetic signatures including deletions in the accessory gene ORF4b. Compared with human and camel MERS-CoV from Saudi Arabia, virus isolates from Burkina Faso (BF785) and Nigeria (Nig1657) had lower virus replication competence in Calu-3 cells and in ex vivo cultures of human bronchus and lung. BF785 replicated to lower titer in lungs of human DPP4-transduced mice. A reverse genetics-derived recombinant MERS-CoV (EMC) lacking ORF4b elicited higher type I and III IFN responses than the isogenic EMC virus in Calu-3 cells. However, ORF4b deletions may not be the major determinant of the reduced replication competence of BF785 and Nig1657. Genetic and phenotypic differences in West African viruses may be relevant to zoonotic potential. There is an urgent need for studies of MERS-CoV at the animal-human interface.
Subject(s)
Camelus/virology , Genetic Variation , Middle East Respiratory Syndrome Coronavirus/genetics , Middle East Respiratory Syndrome Coronavirus/pathogenicity , Africa , Animals , Coronavirus Infections/veterinary , Coronavirus Infections/virology , Female , Humans , Lung/virology , Mice, Inbred C57BL , Phylogeny , Virus Replication , Zoonoses/virology
ABSTRACT
BACKGROUND: Inexpensive pathogen genome sequencing has had a transformative effect on the field of phylodynamics, where ever increasing volumes of data have promised real-time insight into outbreaks of infectious disease. As well as the sheer volume of pathogen isolates being sequenced, the sequencing of whole pathogen genomes, rather than select loci, has allowed phylogenetic analyses to be carried out at finer time scales, often approaching serial intervals for infections caused by rapidly evolving RNA viruses. Despite its utility, whole genome sequencing of pathogens has not been adopted universally and targeted sequencing of loci is common in some pathogen-specific fields. RESULTS: In this study we highlighted the utility of sequencing whole genomes of pathogens by re-analysing a well-characterised collection of Ebola virus sequences in the form of complete viral genomes (≈19 kb long) or the rapidly evolving glycoprotein (GP, ≈2 kb long) gene. We have quantified changes in phylogenetic, temporal, and spatial inference resolution as a result of this reduction in data and compared these to theoretical expectations. CONCLUSIONS: We propose a simple intuitive metric for quantifying temporal resolution, i.e. the time scale over which sequence data might be informative of various processes as a quick back-of-the-envelope calculation of statistical power available to molecular clock analyses.
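A back-of-the-envelope calculation of this kind can be sketched as the expected waiting time for a single substitution along a lineage, which is one plausible reading of the temporal resolution idea and not necessarily the authors' exact definition. The rate used below (1.2 × 10⁻³ substitutions per site per year) is an illustrative assumption applied to both alignments; the abstract describes GP as rapidly evolving, so its true rate is likely higher.

```python
DAYS_PER_YEAR = 365.25


def expected_days_between_substitutions(rate_per_site_per_year: float,
                                        alignment_length_nt: int) -> float:
    """Mean waiting time, in days, for one substitution to arise along a lineage."""
    substitutions_per_year = rate_per_site_per_year * alignment_length_nt
    return DAYS_PER_YEAR / substitutions_per_year


# Assumption: one illustrative rate for both alignments (a simplification).
ASSUMED_RATE = 1.2e-3

# Alignment lengths follow the approximate sizes given in the abstract.
genome_days = expected_days_between_substitutions(ASSUMED_RATE, 19_000)
gp_days = expected_days_between_substitutions(ASSUMED_RATE, 2_000)

print(f"complete genome: one substitution every {genome_days:.0f} days on average")
print(f"GP gene only:    one substitution every {gp_days:.0f} days on average")
```

Because the waiting time scales inversely with alignment length, restricting the data to a 2 kb gene coarsens the time scale by the same factor by which the alignment shrinks, unless the shorter region evolves proportionally faster.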
Subject(s)
Ebolavirus/genetics , Genes, Viral , Genome, Viral , Hemorrhagic Fever, Ebola/epidemiology , Chromosome Mapping , Disease Outbreaks , Hemorrhagic Fever, Ebola/transmission , Hemorrhagic Fever, Ebola/virology , Humans , Markov Chains , Phylogeny , Whole Genome Sequencing
ABSTRACT
The global-scale epidemiology and genome-wide evolutionary dynamics of influenza B remain poorly understood compared with influenza A viruses. We compiled a spatio-temporally comprehensive dataset of influenza B viruses, comprising over 2,500 genomes sampled worldwide between 1987 and 2015, including 382 newly-sequenced genomes that fill substantial gaps in previous molecular surveillance studies. Our contributed data increase the number of available influenza B virus genomes in Europe, Africa and Central Asia, improving the global context to study influenza B viruses. We reveal Yamagata-lineage diversity results from co-circulation of two antigenically-distinct groups that also segregate genetically across the entire genome, without evidence of intra-lineage reassortment. In contrast, Victoria-lineage diversity stems from geographic segregation of different genetic clades, with variability in the degree of geographic spread among clades. Differences between the lineages are reflected in their antigenic dynamics, as Yamagata-lineage viruses show alternating dominance between antigenic groups, while Victoria-lineage viruses show antigenic drift of a single lineage. Structural mapping of amino acid substitutions on trunk branches of influenza B gene phylogenies further supports these antigenic differences and highlights two potential mechanisms of adaptation for polymerase activity. Our study provides new insights into the epidemiological and molecular processes shaping influenza B virus evolution globally.
Subject(s)
Influenza B virus/genetics , Influenza, Human/epidemiology , Influenza, Human/virology , Amino Acid Substitution , Antigenic Variation , Antigens, Viral/genetics , Databases, Genetic , Evolution, Molecular , Genetic Variation , Genome, Viral , Global Health , Hemagglutinin Glycoproteins, Influenza Virus/genetics , Humans , Influenza B virus/classification , Influenza B virus/immunology , Models, Molecular , Molecular Epidemiology , Phylogeny , RNA-Dependent RNA Polymerase/chemistry , RNA-Dependent RNA Polymerase/genetics , Reassortant Viruses/genetics , Viral Proteins/chemistry , Viral Proteins/genetics
ABSTRACT
BACKGROUND: Several patients with Ebola virus disease (EVD) managed in the United States have received ZMapp monoclonal antibodies, TKM-Ebola small interfering RNA, brincidofovir, and/or convalescent plasma as investigational therapeutics. METHODS: To investigate whether treatment selected for Ebola virus (EBOV) mutations conferring resistance, viral sequencing was performed on RNA extracted from clinical blood specimens from patients with EVD following treatment, and putative viral targets were analyzed. RESULTS: We observed no major or minor EBOV mutations within regions targeted by therapeutics. CONCLUSIONS: This small subset of patients and clinical specimens suggests that evolution of resistance is not a direct consequence of antiviral treatment. As EVD antiviral treatments are introduced into wider use, it is essential that continuous viral full-genome surveillance is performed, to monitor for the emergence of escape mutations.
Subject(s)
Antibodies, Monoclonal/therapeutic use , Antiviral Agents/therapeutic use , Ebolavirus/drug effects , Genome, Viral/genetics , Hemorrhagic Fever, Ebola/drug therapy , RNA, Small Interfering/therapeutic use , Convalescence , Drug Resistance, Viral , Ebolavirus/genetics , Ebolavirus/immunology , Evolution, Molecular , Hemorrhagic Fever, Ebola/immunology , Hemorrhagic Fever, Ebola/virology , High-Throughput Nucleotide Sequencing , Humans , Molecular Epidemiology , Mutation , Plasma , Sequence Analysis, DNA
ABSTRACT
Influenza B viruses make a considerable contribution to morbidity attributed to seasonal influenza. Currently circulating influenza B isolates are known to belong to two antigenically distinct lineages referred to as B/Victoria and B/Yamagata. Frequent exchange of genomic segments of these two lineages has been noted in the past, but the observed patterns of reassortment have not been formalized in detail. We investigate interlineage reassortments by comparing phylogenetic trees across genomic segments. Our analyses indicate that of the eight segments of influenza B viruses only segments coding for polymerase basic 1 and 2 (PB1 and PB2) and hemagglutinin (HA) proteins have maintained separate Victoria and Yamagata lineages and that currently circulating strains possess PB1, PB2, and HA segments derived entirely from one or the other lineage; other segments have repeatedly reassorted between lineages thereby reducing genetic diversity. We argue that this difference between segments is due to selection against reassortant viruses with mixed-lineage PB1, PB2, and HA segments. Given sufficient time and continued recruitment to the reassortment-isolated PB1-PB2-HA gene complex, we expect influenza B viruses to eventually undergo sympatric speciation.
Subject(s)
Hemagglutinin Glycoproteins, Influenza Virus/genetics , Influenza B virus/classification , Influenza B virus/genetics , Viral Proteins/genetics , Computational Biology/methods , Evolution, Molecular , Genetic Speciation , Genome, Viral , Humans , Influenza, Human/blood , Influenza, Human/virology , Phylogeny , Selection, Genetic
ABSTRACT
Modern phylogenetics research is often performed within a Bayesian framework, using sampling algorithms such as Markov chain Monte Carlo (MCMC) to approximate the posterior distribution. These algorithms require careful evaluation of the quality of the generated samples. Within the field of phylogenetics, one frequently adopted diagnostic approach is to evaluate the effective sample size (ESS) and to investigate trace graphs of the sampled parameters. A major limitation of these approaches is that they are developed for continuous parameters and therefore incompatible with a crucial parameter in these inferences: the tree topology. Several recent advancements have aimed at extending these diagnostics to topological space. In this reflection paper, we present two case studies - one on Ebola virus and one on HIV - illustrating how these topological diagnostics can contain information not found in standard diagnostics, and how decisions regarding which of these diagnostics to compute can impact inferences regarding MCMC convergence and mixing. Our results show the importance of running multiple replicate analyses and of carefully assessing topological convergence using the output of these replicate analyses. To this end, we illustrate different ways of assessing and visualizing the topological convergence of these replicates. Given the major importance of detecting convergence and mixing issues in Bayesian phylogenetic analyses, the lack of a unified approach to this problem warrants further action, especially now that additional tools are becoming available to researchers.
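For a continuous parameter, the effective sample size mentioned above can be estimated from the autocorrelation of the trace. The following is a minimal sketch of one common estimator, N / (1 + 2 × sum of autocorrelations), with the sum truncated at the first non-positive lag; software packages differ in how they truncate, so this is an illustration and not the estimator of any particular tool. It does not apply to tree topologies, which is the limitation the abstract highlights. All names are hypothetical.

```python
import random


def effective_sample_size(trace):
    """Estimate ESS as N / (1 + 2 * sum of positive-lag autocorrelations).

    The sum is truncated at the first lag whose autocorrelation is <= 0.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        raise ValueError("ESS is undefined for a constant trace")
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


# Illustrative traces: independent draws versus a strongly autocorrelated chain.
rng = random.Random(1)
n_samples = 2000
independent = [rng.gauss(0, 1) for _ in range(n_samples)]

correlated = [0.0]
for _ in range(n_samples - 1):
    # AR(1) process: each value retains 90% of the previous one plus noise.
    correlated.append(0.9 * correlated[-1] + rng.gauss(0, 1))

ess_independent = effective_sample_size(independent)
ess_correlated = effective_sample_size(correlated)
print(f"independent draws: ESS ~ {ess_independent:.0f} of {n_samples}")
print(f"correlated chain:  ESS ~ {ess_correlated:.0f} of {n_samples}")
```

The correlated chain yields a much smaller ESS than the independent draws even though both contain the same number of samples, which is the behavior the diagnostic is meant to expose in poorly mixing MCMC runs.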
ABSTRACT
Phylodynamics is a set of population genetics tools that aim at reconstructing demographic history of a population based on molecular sequences of individuals sampled from the population of interest. One important task in phylodynamics is to estimate changes in (effective) population size. When applied to infectious disease sequences such estimation of population size trajectories can provide information about changes in the number of infections. To model changes in the number of infected individuals, current phylodynamic methods use non-parametric approaches (e.g., Bayesian curve-fitting based on change-point models or Gaussian process priors), parametric approaches (e.g., based on differential equations), and stochastic modeling in conjunction with likelihood-free Bayesian methods. The first class of methods yields results that are hard to interpret epidemiologically. The second class of methods provides estimates of important epidemiological parameters, such as infection and removal/recovery rates, but ignores variation in the dynamics of infectious disease spread. The third class of methods is the most advantageous statistically, but relies on computationally intensive particle filtering techniques that limits its applications. We propose a Bayesian model that combines phylodynamic inference and stochastic epidemic models, and achieves computational tractability by using a linear noise approximation (LNA) - a technique that allows us to approximate probability densities of stochastic epidemic model trajectories. LNA opens the door for using modern Markov chain Monte Carlo tools to approximate the joint posterior distribution of the disease transmission parameters and of high dimensional vectors describing unobserved changes in the stochastic epidemic model compartment sizes (e.g., numbers of infectious and susceptible individuals). In a simulation study, we show that our method can successfully recover parameters of stochastic epidemic models. 
We apply our estimation technique to Ebola genealogies estimated using viral genetic data from the 2014 epidemic in Sierra Leone and Liberia.
ABSTRACT
OBJECTIVES: The origin and spread of dengue virus (DENV) circulating in Africa remain poorly characterized, with African sequences representing <1% of global sequence data. METHODS: Whole genome sequencing was performed on serum samples (n = 29) from an undifferentiated fever study in 2016 in the Democratic Republic of Congo (DRC), and from febrile travelers returning from Africa. The evolutionary history of the newly acquired African DENV-1 (n = 1) and cosmopolitan genotype DENV-2 (n = 18) genomes was reconstructed using a phylogeographic, time-scaled Bayesian analysis on a curated DENV panel including all known African sequences. RESULTS: A minimum of 10 and eight introductions could be identified into Africa for DENV-1 and cosmopolitan DENV-2, respectively, almost all originating from Asia. Three introductions were previously unknown. The currently circulating virus comprises mainly the recently introduced clades and one long-established African clade. Robust geographical clustering suggests limited spread of DENV after each introduction. Our data identified the DRC as the source of the 2018 Angolan DENV-2 epidemic, and similarly, the 2013 Angolan DENV-1 outbreak as the origin of our DRC study. CONCLUSION: Active genomic surveillance of DENV in Africa at the portals of entry might help early outbreak response and limit sero- and genotype spread and human disease burden.