ABSTRACT
The Alpha, Beta, and Gamma SARS-CoV-2 variants of concern (VOCs) co-circulated globally during 2020 and 2021, fueling waves of infections. They were displaced by Delta during a third wave worldwide in 2021, which, in turn, was displaced by Omicron in late 2021. In this study, we use phylogenetic and phylogeographic methods to reconstruct the dispersal patterns of VOCs worldwide. We find that source-sink dynamics varied substantially by VOC and identify countries that acted as global and regional hubs of dissemination. We demonstrate the declining role of presumed origin countries of VOCs in their global dispersal, estimating that India contributed <15% of Delta exports and South Africa <1%-2% of Omicron dispersal. We estimate that >80 countries had received introductions of Omicron within 100 days of its emergence, associated with accelerated passenger air travel and higher transmissibility. Our study highlights the rapid dispersal of highly transmissible variants, with implications for genomic surveillance along the hierarchical airline network.
Subjects
Air Travel , COVID-19 , Humans , Phylogeny , SARS-CoV-2
ABSTRACT
We present evidence for multiple independent origins of recombinant SARS-CoV-2 viruses sampled from late 2020 and early 2021 in the United Kingdom. Their genomes carry single-nucleotide polymorphisms and deletions that are characteristic of the B.1.1.7 variant of concern but lack the full complement of lineage-defining mutations. Instead, the remainder of their genomes share contiguous genetic variation with non-B.1.1.7 viruses circulating in the same geographic area at the same time as the recombinants. In four instances, there was evidence for onward transmission of a recombinant-origin virus, including one transmission cluster of 45 sequenced cases over the course of 2 months. The inferred genomic locations of recombination breakpoints suggest that every community-transmitted recombinant virus inherited its spike region from a B.1.1.7 parental virus, consistent with a transmission advantage for B.1.1.7's set of mutations.
Subjects
COVID-19/epidemiology , COVID-19/transmission , Pandemics , Recombination, Genetic , SARS-CoV-2/genetics , Base Sequence/genetics , COVID-19/virology , Computational Biology/methods , Gene Frequency , Genome, Viral , Genotype , Humans , Mutation , Phylogeny , Polymorphism, Single Nucleotide , United Kingdom/epidemiology , Whole Genome Sequencing/methods
ABSTRACT
Global dispersal and increasing frequency of the SARS-CoV-2 spike protein variant D614G are suggestive of a selective advantage but may also be due to a random founder effect. We investigate the hypothesis for positive selection of spike D614G in the United Kingdom using more than 25,000 whole genome SARS-CoV-2 sequences. Despite the availability of a large dataset, well represented by both spike 614 variants, not all approaches showed a conclusive signal of positive selection. Population genetic analysis indicates that 614G increases in frequency relative to 614D in a manner consistent with a selective advantage. We do not find any indication that patients infected with the spike 614G variant have higher COVID-19 mortality or clinical severity, but 614G is associated with higher viral load and younger age of patients. Significant differences in growth and size of 614G phylogenetic clusters indicate a need for continued study of this variant.
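The population-genetic argument can be illustrated with a minimal sketch, which is not the study's own analysis. If one variant has a constant relative growth advantage s, the log-odds of its frequency rise linearly in time, so s can be estimated as the slope of a least-squares line. The function names, the daily time unit, and the toy frequencies below are assumptions made for illustration only.

```python
import math


def logit(p):
    """Log-odds of a frequency strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def estimate_selection_coefficient(times, freqs):
    """Least-squares slope of logit(frequency) against time.

    Under a simple two-variant model with a constant growth advantage,
    this slope is the per-unit-time selection coefficient s.
    """
    if len(times) != len(freqs) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    ys = [logit(p) for p in freqs]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    return sxy / sxx


# Toy data (hypothetical): frequencies drawn from a logistic curve with s = 0.1 per day.
true_s = 0.1
days = list(range(0, 60, 10))
toy_freqs = [1.0 / (1.0 + math.exp(-(true_s * d - 2.0))) for d in days]
print(round(estimate_selection_coefficient(days, toy_freqs), 3))
```

A rising slope alone does not separate selection from a founder effect, which is why the abstract reports that not all approaches gave a conclusive signal.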
Subjects
Amino Acid Substitution , COVID-19/transmission , COVID-19/virology , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Spike Glycoprotein, Coronavirus/genetics , Aspartic Acid/analysis , Aspartic Acid/genetics , COVID-19/epidemiology , Genome, Viral , Glycine/analysis , Glycine/genetics , Humans , Mutation , SARS-CoV-2/growth & development , United Kingdom/epidemiology , Virulence , Whole Genome Sequencing
ABSTRACT
Coronavirus disease 2019 (COVID-19) is caused by SARS-CoV-2 infection and was first reported in central China in December 2019. Extensive molecular surveillance in Guangdong, China's most populous province, during early 2020 resulted in 1,388 reported RNA-positive cases from 1.6 million tests. In order to understand the molecular epidemiology and genetic diversity of SARS-CoV-2 in China, we generated 53 genomes from infected individuals in Guangdong using a combination of metagenomic sequencing and tiling amplicon approaches. Combined epidemiological and phylogenetic analyses indicate multiple independent introductions to Guangdong, although phylogenetic clustering is uncertain because of low virus genetic variation early in the pandemic. Our results illustrate how the timing, size, and duration of putative local transmission chains were constrained by national travel restrictions and by the province's large-scale intensive surveillance and intervention measures. Despite these successes, COVID-19 surveillance in Guangdong is still required, because the number of cases imported from other countries has increased.
Subjects
Betacoronavirus/genetics , Coronavirus Infections/epidemiology , Pneumonia, Viral/epidemiology , Bayes Theorem , COVID-19 , China/epidemiology , Coronavirus Infections/virology , Epidemiological Monitoring , Humans , Likelihood Functions , Pandemics , Pneumonia, Viral/virology , SARS-CoV-2 , Travel
ABSTRACT
The emergence and spread of Zika virus in the Americas continues to challenge our disease surveillance systems. Virus genome sequencing during the epidemic uncovered the timescale of Zika virus transmission and spread. Yet, we are only beginning to explore how genomics can enhance our responses to emerging viruses.
Subjects
Genome, Viral/genetics , Genomics/methods , Zika Virus Infection/transmission , Zika Virus/genetics , Americas/epidemiology , Brazil/epidemiology , Communicable Diseases, Emerging/epidemiology , Communicable Diseases, Emerging/transmission , Communicable Diseases, Emerging/virology , Epidemics , Geography , Humans , Zika Virus/pathogenicity , Zika Virus Infection/virology
ABSTRACT
Determining the transmissibility, prevalence and patterns of movement of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections is central to our understanding of the impact of the pandemic and to the design of effective control strategies. Phylogenies (evolutionary trees) have provided key insights into the international spread of SARS-CoV-2 and enabled investigation of individual outbreaks and transmission chains in specific settings. Phylodynamic approaches combine evolutionary, demographic and epidemiological concepts and have helped track virus genetic changes, identify emerging variants and inform public health strategy. Here, we review and synthesize studies that illustrate how phylogenetic and phylodynamic techniques were applied during the first year of the pandemic, and summarize their contributions to our understanding of SARS-CoV-2 transmission and control.
Subjects
COVID-19 , Pandemics , Humans , Pandemics/prevention & control , Phylogeny , SARS-CoV-2/genetics
ABSTRACT
The SARS-CoV-2 Delta (Pango lineage B.1.617.2) variant of concern spread globally, causing resurgences of COVID-19 worldwide [1,2]. The emergence of the Delta variant in the UK occurred on the background of a heterogeneous landscape of immunity and relaxation of non-pharmaceutical interventions. Here we analyse 52,992 SARS-CoV-2 genomes from England together with 93,649 genomes from the rest of the world to reconstruct the emergence of Delta and quantify its introduction to and regional dissemination across England in the context of changing travel and social restrictions. Using analysis of human movement, contact tracing and virus genomic data, we find that the geographic focus of the expansion of Delta shifted from India to a more global pattern in early May 2021. In England, Delta lineages were introduced more than 1,000 times and spread nationally as non-pharmaceutical interventions were relaxed. We find that hotel quarantine for travellers reduced onward transmission from importations; however, the transmission chains that later dominated the Delta wave in England were seeded before travel restrictions were introduced. Increasing inter-regional travel within England drove the nationwide dissemination of Delta, with some cities receiving more than 2,000 observable lineage introductions from elsewhere. Subsequently, increased levels of local population mixing, and not the number of importations, were associated with the faster relative spread of Delta. The invasion dynamics of Delta depended on spatial heterogeneity in contact patterns, and our findings will inform optimal spatial interventions to reduce the transmission of current and future variants of concern, such as Omicron (Pango lineage B.1.1.529).
Subjects
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19/transmission , COVID-19/virology , Cities/epidemiology , Contact Tracing , England/epidemiology , Genome, Viral/genetics , Humans , Quarantine/legislation & jurisprudence , SARS-CoV-2/genetics , SARS-CoV-2/growth & development , SARS-CoV-2/isolation & purification , Travel/legislation & jurisprudence
ABSTRACT
Migratory birds play a critical role in the rapid spread of highly pathogenic avian influenza (HPAI) H5N8 virus clade 2.3.4.4 across Eurasia. Elucidating the timing and pattern of virus transmission is essential therefore for understanding the spatial dissemination of these viruses. In this study, we surveyed >27,000 wild birds in China, tracked the year-round migration patterns of 20 bird species across China since 2006, and generated new HPAI H5N8 virus genomic data. Using this new data set, we investigated the seasonal transmission dynamics of HPAI H5N8 viruses across Eurasia. We found that introductions of HPAI H5N8 viruses to different Eurasian regions were associated with the seasonal migration of wild birds. Moreover, we report a backflow of HPAI H5N8 virus lineages from Europe to Asia, suggesting that Europe acts as both a source and a sink in the global HPAI virus transmission network.
Subjects
Influenza A Virus, H5N8 Subtype , Influenza A virus , Influenza in Birds , Animals , Influenza A Virus, H5N8 Subtype/genetics , Birds , Influenza A virus/genetics , Animals, Wild , Influenza in Birds/epidemiology , Europe/epidemiology , Asia/epidemiology , Phylogeny , Disease Outbreaks
ABSTRACT
Foot-and-mouth disease is a highly contagious disease affecting cloven-hoofed animals, resulting in considerable economic losses. Its causal agent is foot-and-mouth disease virus (FMDV), a picornavirus. Due to its error-prone replication and rapid evolution, the transmission and evolutionary dynamics of FMDV can be studied using genomic epidemiological approaches. To analyze FMDV evolution and identify possible transmission routes in an Argentinean region, field samples that tested positive for FMDV by PCR were obtained from 21 farms located in the Mar Chiquita district. Whole FMDV genome sequences were obtained by PCR amplification in seven fragments and sequencing using the Sanger technique. The genome sequences obtained from these samples were then analyzed using phylogenetic, phylogeographic, and evolutionary approaches. Three local transmission clusters were detected among the sampled viruses. The dataset was analyzed using Bayesian phylodynamic methods with appropriate coalescent and relaxed molecular clock models. The estimated mean viral evolutionary rate was 1.17 × 10⁻² substitutions/site/year. No significant differences in the rate of viral evolution were observed between farms with vaccinated animals and those with unvaccinated animals. The most recent common ancestor of the sampled sequences was dated to approximately one month before the first reported case in the outbreak. Virus transmission started in the south of the district and later dispersed to the west, and finally arrived in the east. Different transmission routes among the studied herds, such as non-replicating vectors and close contact contagion (i.e., aerosols), may be responsible for viral spread.
Subjects
Foot-and-Mouth Disease Virus , Picornaviridae , Animals , Foot-and-Mouth Disease Virus/genetics , Argentina/epidemiology , Bayes Theorem , Phylogeny
ABSTRACT
High-throughput sequencing enables rapid genome sequencing during infectious disease outbreaks and provides an opportunity to quantify the evolutionary dynamics of pathogens in near real-time. One difficulty of undertaking evolutionary analyses over short timescales is the dependency of the inferred evolutionary parameters on the timespan of observation. Crucially, there are an increasing number of molecular clock analyses using external evolutionary rate priors to infer evolutionary parameters. However, it is not clear which rate prior is appropriate for a given time window of observation due to the time-dependent nature of evolutionary rate estimates. Here, we characterize the molecular evolutionary dynamics of SARS-CoV-2 and 2009 pandemic H1N1 (pH1N1) influenza during the first 12 months of their respective pandemics. We use Bayesian phylogenetic methods to estimate the dates of emergence, evolutionary rates, and growth rates of SARS-CoV-2 and pH1N1 over time and investigate how varying sampling window and data set sizes affect the accuracy of parameter estimation. We further use a generalized McDonald-Kreitman test to estimate the number of segregating nonneutral sites over time. We find that the inferred evolutionary parameters for both pandemics are time dependent, and that the inferred rates of SARS-CoV-2 and pH1N1 decline by ~50% and ~100%, respectively, over the course of 1 year. After at least 4 months since the start of sequence sampling, inferred growth rates and emergence dates remain relatively stable and can be inferred reliably using a logistic growth coalescent model. We show that the time dependency of the mean substitution rate is due to elevated substitution rates at terminal branches which are 2-4 times higher than those of internal branches for both viruses. The elevated rate at terminal branches is strongly correlated with an increasing number of segregating nonneutral sites, demonstrating the role of purifying selection in generating the time dependency of evolutionary parameters during pandemics.
Subjects
COVID-19 , Influenza A Virus, H1N1 Subtype , Influenza, Human , Bayes Theorem , Humans , Influenza A Virus, H1N1 Subtype/genetics , Influenza, Human/epidemiology , Phylogeny , SARS-CoV-2
ABSTRACT
As viral genomic imprints in host genomes, endogenous viral elements (EVEs) shed light on the deep evolutionary history of viruses, ancestral host ranges, and ancient viral-host interactions. In addition, they may provide crucial information for calibrating viral evolutionary timescales. In this study, we conducted a comprehensive in silico screening of a large data set of available mammalian genomes for EVEs deriving from members of the viral family Flaviviridae, an important group of viruses including well-known human pathogens, such as Zika, dengue, or hepatitis C viruses. We identified two novel pestivirus-like EVEs in the reference genome of the Indochinese shrew (Crocidura indochinensis). Homologs of these novel EVEs were subsequently detected in vivo by molecular detection and sequencing in 27 shrew species, including 26 species representing a wide distribution within the Crocidurinae subfamily and one in the Soricinae subfamily on different continents. Based on this wide distribution, we estimate that the integration event occurred before the last common ancestor of the subfamily, about 10.8 million years ago, attesting to an ancient origin of pestiviruses and Flaviviridae in general. Moreover, we provide the first description of Flaviviridae-derived EVEs in mammals even though the family encompasses numerous mammal-infecting members. This also suggests that shrews were past and perhaps also current natural reservoirs of pestiviruses. Taken together, our results expand the current known Pestivirus host range and provide novel insight into the ancient evolutionary history of pestiviruses and the Flaviviridae family in general.
Subjects
Pestivirus , Viruses , Zika Virus Infection , Zika Virus , Animals , Evolution, Molecular , Genome, Viral , Humans , Pestivirus/genetics , Phylogeny , Shrews/genetics , Viruses/genetics , Zika Virus/genetics
ABSTRACT
Viruses emerging from wildlife can cause outbreaks in humans and domesticated animals. Predicting the emergence of future pathogens and mitigating their impacts requires an understanding of what shapes virus diversity and dynamics in wildlife reservoirs. In order to better understand coronavirus ecology in wild species, we sampled birds within a coastal freshwater lagoon habitat across 5 years, focussing on a large population of mute swans (Cygnus olor) and the diverse species that they interact with. We discovered and characterised the full genome of a divergent gammacoronavirus belonging to the Goose coronavirus CB17 species. We investigated the genetic diversity and dynamics of this gammacoronavirus using untargeted metagenomic sequencing of 223 faecal samples from swans of known age and sex, and RT-PCR screening of 1632 additional bird samples. The virus circulated persistently within the bird community; virus prevalence in mute swans exhibited seasonal variations, but did not change with swan age-class or epidemiological year. One whole genome was fully characterised, and revealed that the virus originated from a recombination event involving an undescribed gammacoronavirus species. Multiple lineages of this gammacoronavirus co-circulated within our study population. Viruses from this species have recently been detected in aquatic birds from both the Anatidae and Rallidae families, implying that host species habitat sharing may be important in shaping virus host range. As the host range of the Goose coronavirus CB17 species is not limited to geese, we propose that this species name should be updated to 'Waterbird gammacoronavirus 1'. Non-invasive sampling of bird coronaviruses may provide a tractable model system for understanding the evolutionary and cross-species dynamics of coronaviruses.
Subjects
Anseriformes , Coronavirus Infections , Coronavirus , Gammacoronavirus , Humans , Animals , Gammacoronavirus/genetics , Coronavirus/genetics , Disease Outbreaks , Coronavirus Infections/epidemiology , Coronavirus Infections/veterinary , Animals, Wild , Genetic Variation , Recombination, Genetic
ABSTRACT
Viral discovery studies in wild animals often rely on cross-sectional surveys at a single time point. As a result, our understanding of the temporal stability of wild animal viromes remains poorly resolved. While studies of single host-virus systems indicate that host and environmental factors influence seasonal virus transmission dynamics, comparable insights for whole viral communities in multiple hosts are lacking. Utilizing noninvasive faecal samples from a long-term wild rodent study, we characterized viral communities of three common European rodent species (Apodemus sylvaticus, A. flavicollis and Myodes glareolus) living in temperate woodland over a single year. Our findings indicate that a substantial fraction of the rodent virome is seasonally transient and associated with vertebrate or bacteria hosts. Further analyses of one of the most common virus families, Picornaviridae, show pronounced temporal changes in viral richness and evenness, which were associated with concurrent and up to ~3-month lags in host density, ambient temperature, rainfall and humidity, suggesting complex feedbacks from the host and environmental factors on virus transmission and shedding in seasonal habitats. Overall, this study emphasizes the importance of understanding the seasonal dynamics of wild animal viromes in order to better predict and mitigate zoonotic risks.
Subjects
Virome , Animals , Seasons , Cross-Sectional Studies , Animals, Wild , Arvicolinae , Murinae
ABSTRACT
B cells undergo rapid mutation and selection for antibody binding affinity when producing antibodies capable of neutralizing pathogens. This evolutionary process can be intermixed with migration between tissues, differentiation between cellular subsets, and switching between functional isotypes. B cell receptor (BCR) sequence data has the potential to elucidate important information about these processes. However, there is currently no robust, generalizable framework for making such inferences from BCR sequence data. To address this, we develop three parsimony-based summary statistics to characterize migration, differentiation, and isotype switching along B cell phylogenetic trees. We use simulations to demonstrate the effectiveness of this approach. We then use this framework to infer patterns of cellular differentiation and isotype switching from high throughput BCR sequence datasets obtained from patients in a study of HIV infection and a study of food allergy. These methods are implemented in the R package dowser, available at https://dowser.readthedocs.io.
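The general idea behind parsimony-based statistics of this kind can be sketched as follows. This is not dowser's implementation, and the three statistics named in the abstract are not reproduced here. The sketch counts the minimum number of trait changes on a fixed tree with Fitch parsimony, then compares that count with counts obtained after shuffling the tip labels. The nested-tuple tree format, the function names, and the toy tissue labels are all assumptions for illustration.

```python
import random


def fitch_score(tree, tip_states):
    """Minimum number of trait changes on a rooted binary tree (Fitch parsimony).

    `tree` is a nested tuple of tip names, e.g. (("a", "b"), ("c", "d")).
    `tip_states` maps each tip name to a discrete trait (tissue, isotype, ...).
    """
    def visit(node):
        if isinstance(node, str):          # tip: state set is the observed trait
            return {tip_states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:                         # children agree: no change at this node
            return shared, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def permutation_p_value(tree, tip_states, n_perm=1000, seed=0):
    """Share of tip-label shuffles whose score is <= the observed score."""
    rng = random.Random(seed)
    observed = fitch_score(tree, tip_states)
    tips = list(tip_states)
    states = [tip_states[t] for t in tips]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(states)
        if fitch_score(tree, dict(zip(tips, states))) <= observed:
            hits += 1
    return hits / n_perm


# Toy example (hypothetical): four sequences sampled from two tissues.
toy_tree = (("a", "b"), ("c", "d"))
clustered = {"a": "blood", "b": "blood", "c": "spleen", "d": "spleen"}
mixed = {"a": "blood", "b": "spleen", "c": "blood", "d": "spleen"}
print(fitch_score(toy_tree, clustered), fitch_score(toy_tree, mixed))
```

A small permutation p-value suggests that the trait is more clustered on the tree than expected by chance, that is, fewer switches than random labelling would produce.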
Subjects
HIV Infections , Immunoglobulin Class Switching , B-Lymphocytes , Cell Differentiation/genetics , Humans , Phylogeny , Receptors, Antigen, B-Cell/genetics
ABSTRACT
The live poultry trade is thought to play an important role in the spread and maintenance of highly pathogenic avian influenza A viruses (HP AIVs) in Asia. Despite an abundance of small-scale observational studies, the role of the poultry trade in disseminating AIV over large geographic areas is still unclear, especially for developing countries with complex poultry production systems. Here we combine virus genomes and reconstructed poultry transportation data to measure and compare the spatial spread in China of three key subtypes of AIV: H5N1, H7N9, and H5N6. Although it is difficult to disentangle the contribution of confounding factors, such as bird migration and spatial distance, we find evidence that the dissemination of these subtypes among domestic poultry is geographically continuous and likely associated with the intensity of the live poultry trade in China. Using two independent data sources and network analysis methods, we report a regional-scale community structure in China that might explain the spread of AIV subtypes in the country. The identification of this structure has the potential to inform more targeted strategies for the prevention and control of AIV in China.
Subjects
Influenza in Birds/epidemiology , Influenza in Birds/transmission , Influenza in Birds/virology , Poultry/virology , Animals , China/epidemiology , Genome, Viral , Humans , Influenza A Virus, H5N1 Subtype , Influenza A Virus, H7N9 Subtype , Phylogeography , Transportation
ABSTRACT
BACKGROUND: More than 2 million SARS-CoV-2 genome sequences have been generated and shared since the start of the COVID-19 pandemic and constitute a vital information source that informs outbreak control, disease surveillance, and public health policy. The Pango dynamic nomenclature is a popular system for classifying and naming genetically-distinct lineages of SARS-CoV-2, including variants of concern, and is based on the analysis of complete or near-complete virus genomes. However, for several reasons, nucleotide sequences may be generated that cover only the spike gene of SARS-CoV-2. It is therefore important to understand how much information about Pango lineage status is contained in spike-only nucleotide sequences. Here we explore how Pango lineages might be reliably designated and assigned to spike-only nucleotide sequences. We survey the genetic diversity of such sequences, and investigate the information they contain about Pango lineage status. RESULTS: Although many lineages, including the main variants of concern, can be identified clearly using spike-only sequences, some spike-only sequences are shared among tens or hundreds of Pango lineages. To facilitate the classification of SARS-CoV-2 lineages using subgenomic sequences we introduce the notion of designating such sequences to a "lineage set", which represents the range of Pango lineages that are consistent with the observed mutations in a given spike sequence. CONCLUSIONS: We find that many lineages, including the main variants-of-concern, can be reliably identified by spike alone and we define lineage-sets to represent the lineage precision that can be achieved using spike-only nucleotide sequences. These data provide a foundation for the development of software tools that can assign newly-generated spike nucleotide sequences to Pango lineage sets.
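The "lineage set" idea can be made concrete with a small sketch. It assumes, for illustration only, that each lineage is summarised by one set of spike mutations and that "consistent with" means an exact match to that set; the study's actual designation rules may differ. The lookup table, lineage names, and mutation labels are placeholders, not real Pango definitions.

```python
def lineage_set(observed_spike_mutations, spike_profiles):
    """Return every lineage whose spike mutation profile equals the observed one."""
    observed = frozenset(observed_spike_mutations)
    return sorted(
        lineage
        for lineage, profile in spike_profiles.items()
        if frozenset(profile) == observed
    )


# Hypothetical lookup table: names and mutations are placeholders.
SPIKE_PROFILES = {
    "lineage_A": {"mut1", "mut2"},
    "lineage_B": {"mut1", "mut2"},   # same spike as lineage_A, differs elsewhere in the genome
    "lineage_C": {"mut1", "mut3"},
}

print(lineage_set({"mut2", "mut1"}, SPIKE_PROFILES))
print(lineage_set({"mut1", "mut3"}, SPIKE_PROFILES))
```

A set with a single member corresponds to a lineage that spike alone identifies; a larger set expresses the limit of precision when several lineages share one spike sequence.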
Subjects
COVID-19 , SARS-CoV-2 , Base Sequence , Humans , Mutation , Pandemics , Phylogeny , Spike Glycoprotein, Coronavirus/genetics
ABSTRACT
Limited genomic sampling in many high-incidence countries has impeded studies of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomic epidemiology. Consequently, critical questions remain about the generation and global distribution of virus genetic diversity. We investigated SARS-CoV-2 transmission dynamics in Gujarat, India, during the state's first epidemic wave to shed light on spread of the virus in one of the regions hardest hit by the pandemic. By integrating case data and 434 whole-genome sequences sampled across 20 districts, we reconstructed the epidemic dynamics and spatial spread of SARS-CoV-2 in Gujarat. Our findings indicate global and regional connectivity and population density were major drivers of the Gujarat outbreak. We detected >100 virus lineage introductions, most of which appear to be associated with international travel. Within Gujarat, virus dissemination occurred predominantly from densely populated regions to geographically proximate locations that had low population density, suggesting that urban centers contributed disproportionately to virus spread.
Subjects
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , Genome, Viral , Genomics , Humans , India/epidemiology , Phylogeny , SARS-CoV-2/genetics
ABSTRACT
Spatially explicit phylogeographic analyses can be performed with an inference framework that employs relaxed random walks to reconstruct phylogenetic dispersal histories in continuous space. This core model was first implemented 10 years ago and has opened up new opportunities in the field of phylodynamics, allowing researchers to map and analyze the spatial dissemination of rapidly evolving pathogens. We here provide a detailed and step-by-step guide on how to set up, run, and interpret continuous phylogeographic analyses using the programs BEAUti, BEAST, Tracer, and TreeAnnotator.
Subjects
Phylogeography/methods , Software , Bayes Theorem , Biological Evolution
ABSTRACT
São Paulo, a densely inhabited state in southeast Brazil that contains the fourth most populated city in the world, recently experienced its largest yellow fever virus (YFV) outbreak in decades. YFV does not normally circulate extensively in São Paulo, so most people were unvaccinated when the outbreak began. Surveillance in non-human primates (NHPs) is important for determining the magnitude and geographic extent of an epizootic, thereby helping to evaluate the risk of YFV spillover to humans. Data from infected NHPs can give more accurate insights into YFV spread than data from human cases alone. To contextualise human cases, identify epizootic foci and uncover the rate and direction of YFV spread in São Paulo, we generated and analysed virus genomic data and epizootic case data from NHPs in São Paulo. We report the occurrence of three spatiotemporally distinct phases of the outbreak in São Paulo prior to February 2018. We generated 51 new virus genomes from YFV positive cases identified in 23 different municipalities in São Paulo, mostly sampled from NHPs between October 2016 and January 2018. Although we observe substantial heterogeneity in lineage dispersal velocities between phylogenetic branches, continuous phylogeographic analyses of generated YFV genomes suggest that YFV lineages spread in São Paulo at a mean rate of approximately 1 km per day during all phases of the outbreak. Viral lineages from the first epizootic phase in northern São Paulo subsequently dispersed towards the south of the state to cause the second and third epizootic phases there. This alters our understanding of how YFV was introduced into the densely populated south of São Paulo state. Our results shed light on the sylvatic transmission of YFV in highly fragmented forested regions in São Paulo state and highlight the importance of continued surveillance of zoonotic pathogens in sentinel species.
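A dispersal rate in km per day can be summarised from a spatially annotated tree in several ways. The sketch below shows one common choice, total great-circle distance divided by total branch duration, and is not necessarily the estimator used in the study. The branch tuple layout, the function names, and the toy coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def weighted_dispersal_velocity(branches):
    """Total distance travelled divided by total time elapsed, in km per day.

    Each branch is (start_lat, start_lon, end_lat, end_lon, duration_days).
    """
    total_km = sum(haversine_km(b[0], b[1], b[2], b[3]) for b in branches)
    total_days = sum(b[4] for b in branches)
    return total_km / total_days


# Toy branches (hypothetical coordinates and durations, not data from the study).
toy_branches = [
    (0.0, 0.0, 0.0, 1.0, 100.0),   # about 111 km in 100 days
    (0.0, 0.0, 1.0, 0.0, 120.0),   # about 111 km in 120 days
]
print(round(weighted_dispersal_velocity(toy_branches), 2))
```

Dividing totals rather than averaging per-branch velocities keeps very short branches from dominating the summary, which matters when velocities vary widely between branches.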
Subjects
Genome, Viral , Primate Diseases/virology , Yellow Fever/veterinary , Yellow Fever/virology , Yellow fever virus/genetics , Zoonoses/virology , Animals , Brazil/epidemiology , Disease Outbreaks , Genomics , Humans , Phylogeny , Phylogeography , Primate Diseases/epidemiology , Primate Diseases/transmission , Primates/virology , Yellow Fever/epidemiology , Yellow Fever/transmission , Yellow fever virus/classification , Yellow fever virus/isolation & purification , Zoonoses/epidemiology , Zoonoses/transmission
ABSTRACT
In Bayesian phylogenetics, the coalescent process provides an informative framework for inferring changes in the effective size of a population from a phylogeny (or tree) of sequences sampled from that population. Popular coalescent inference approaches such as the Bayesian Skyline Plot, Skyride, and Skygrid all model these population size changes with a discontinuous, piecewise-constant function but then apply a smoothing prior to ensure that their posterior population size estimates transition gradually with time. These prior distributions implicitly encode extra population size information that is not available from the observed coalescent data or tree. Here, we present a novel statistic, $\Omega$, to quantify and disaggregate the relative contributions of the coalescent data and prior assumptions to the resulting posterior estimate precision. Our statistic also measures the additional mutual information introduced by such priors. Using $\Omega$ we show that, because it is surprisingly easy to overparametrize piecewise-constant population models, common smoothing priors can lead to overconfident and potentially misleading inference, even under robust experimental designs. We propose $\Omega$ as a useful tool for detecting when effective population size estimates are overly reliant on prior assumptions and for improving quantification of the uncertainty in those estimates.[Coalescent processes; effective population size; information theory; phylodynamics; prior assumptions; skyline plots.].
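To make the piecewise-constant model concrete, the sketch below computes the classic skyline estimate, in which each coalescent interval with k lineages and waiting time w yields the estimate C(k,2)·w, together with a grouped version that pools consecutive intervals. It applies no smoothing prior and does not implement the Ω statistic. The function names and toy intervals are assumptions, and the tree is assumed to be sampled at a single time point.

```python
from math import comb


def classic_skyline(intervals):
    """Per-interval effective population size estimates.

    `intervals` is a list of (k, w): k lineages present and the waiting
    time w until the next coalescence. Estimates are in the same time
    units as the waiting times.
    """
    return [comb(k, 2) * w for k, w in intervals]


def grouped_skyline(intervals, group_size):
    """Pool consecutive intervals so that each segment shares one estimate.

    For a segment containing n coalescent events, the maximum-likelihood
    estimate is the sum of C(k, 2) * w over its intervals, divided by n.
    """
    estimates = []
    for start in range(0, len(intervals), group_size):
        group = intervals[start:start + group_size]
        estimates.append(sum(comb(k, 2) * w for k, w in group) / len(group))
    return estimates


# Toy coalescent intervals (hypothetical), ordered from the tips towards the root.
toy_intervals = [(4, 0.5), (3, 1.0), (2, 2.0)]
print(classic_skyline(toy_intervals))
print(grouped_skyline(toy_intervals, group_size=2))
```

With one segment per coalescent event, each estimate rests on a single waiting time, which illustrates how readily such models become overparametrized and why a smoothing prior can end up supplying much of the apparent precision.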