ABSTRACT
Global dispersal and increasing frequency of the SARS-CoV-2 spike protein variant D614G are suggestive of a selective advantage but may also be due to a random founder effect. We investigate the hypothesis of positive selection of spike D614G in the United Kingdom using more than 25,000 whole genome SARS-CoV-2 sequences. Despite the availability of a large dataset in which both spike 614 variants are well represented, not all approaches showed a conclusive signal of positive selection. Population genetic analysis indicates that 614G increases in frequency relative to 614D in a manner consistent with a selective advantage. We do not find any indication that patients infected with the spike 614G variant have higher COVID-19 mortality or clinical severity, but 614G is associated with higher viral load and younger age of patients. Significant differences in growth and size of 614G phylogenetic clusters indicate a need for continued study of this variant.
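As a rough illustration of what an increase in frequency "consistent with a selective advantage" can look like, the sketch below assumes a simple two-variant logistic model in which the log-odds of 614G rise linearly in time. The abstract does not state which model was actually used; the function name, the starting frequency, and the selection coefficient are hypothetical.

```python
import math


def variant_frequency(p0: float, s: float, t: float) -> float:
    """Frequency of the favoured variant after t time units.

    Assumes logistic growth: logit(p(t)) = logit(p0) + s * t, where s is a
    hypothetical difference in growth rate between the two variants.
    """
    log_odds = math.log(p0 / (1.0 - p0)) + s * t
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Hypothetical values: 20% starting frequency, s = 0.3 per week.
    for week in range(0, 13, 4):
        print(week, round(variant_frequency(0.2, 0.3, week), 3))
```

Under this assumption, s = 0 leaves the frequency unchanged, which is the pattern expected on average when neither variant has an advantage; a founder effect would instead show up as a frequency shift unrelated to s.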
Subject(s)
Amino Acid Substitution, COVID-19/transmission, COVID-19/virology, SARS-CoV-2/genetics, SARS-CoV-2/pathogenicity, Coronavirus Spike Glycoprotein/genetics, Aspartic Acid/analysis, Aspartic Acid/genetics, COVID-19/epidemiology, Viral Genome, Glycine/analysis, Glycine/genetics, Humans, Mutation, SARS-CoV-2/growth & development, United Kingdom/epidemiology, Virulence, Whole Genome Sequencing
ABSTRACT
Phylogenetic dating is one of the most powerful and commonly used methods of drawing epidemiological interpretations from pathogen genomic data. Building such trees requires considering a molecular clock model which represents the rate at which substitutions accumulate on genomes. When the molecular clock rate is constant throughout the tree then the clock is said to be strict, but this is often not an acceptable assumption. Alternatively, relaxed clock models consider variations in the clock rate, often based on a distribution of rates for each branch. However, we show here that the distributions of rates across branches in commonly used relaxed clock models are incompatible with the biological expectation that the sum of the numbers of substitutions on two neighboring branches should be distributed as the substitution number on a single branch of equivalent length. We call this expectation the additivity property. We further show how assumptions of commonly used relaxed clock models can lead to estimates of evolutionary rates and dates with low precision and biased confidence intervals. We therefore propose a new additive relaxed clock model where the additivity property is satisfied. We illustrate the use of our new additive relaxed clock model on a range of simulated and real data sets, and we show that using this new model leads to more accurate estimates of mean evolutionary rates and ancestral dates.
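The additivity property can be checked numerically for a simple case. The sketch below is not the model proposed in the paper; it assumes an uncorrelated relaxed clock in which each branch draws its rate independently with a given mean and variance, and the substitution count is Poisson given that rate. The function names and numbers are hypothetical.

```python
def substitution_variance(length: float, rate_mean: float, rate_var: float) -> float:
    """Variance of the substitution count on a single branch.

    Assumes count | rate ~ Poisson(rate * length), with the rate drawn
    independently per branch. By the law of total variance:
    Var = E[rate * length] + Var(rate * length).
    """
    return rate_mean * length + rate_var * length ** 2


def split_vs_whole(l1: float, l2: float, rate_mean: float, rate_var: float):
    """Variance for two neighbouring branches versus one branch of length l1 + l2."""
    split = (substitution_variance(l1, rate_mean, rate_var)
             + substitution_variance(l2, rate_mean, rate_var))
    whole = substitution_variance(l1 + l2, rate_mean, rate_var)
    return split, whole


if __name__ == "__main__":
    print(split_vs_whole(1.0, 2.0, 3.0, 0.0))  # strict clock: no rate variance
    print(split_vs_whole(1.0, 2.0, 3.0, 0.5))  # relaxed clock: rates vary per branch
```

Under these assumptions the expected counts agree in both cases, but with a nonzero rate variance the two variances differ by 2 * rate_var * l1 * l2, so the sum over two branches is not distributed like the count on one branch of the same total length.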
Subject(s)
Molecular Evolution, Bacterial Genome, Genetic Models, Phylogeny, Mutation
ABSTRACT
Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data into a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.
Subject(s)
Bayes Theorem, Biological Evolution, Phylogeny, Software, Animals, Computational Biology, Computer Simulation, Molecular Evolution, Humans, Markov Chains, Genetic Models, Monte Carlo Method
ABSTRACT
Population genetic modeling can enhance Bayesian phylogenetic inference by providing a realistic prior on the distribution of branch lengths and times of common ancestry. The parameters of a population genetic model may also have intrinsic importance, and simultaneous estimation of a phylogeny and model parameters has enabled phylodynamic inference of population growth rates, reproduction numbers, and effective population size through time. Phylodynamic inference based on pathogen genetic sequence data has emerged as a useful supplement to epidemic surveillance; however, commonly used mechanistic models that are typically fitted to non-genetic surveillance data are rarely fitted to pathogen genetic data due to a dearth of software tools, and the theory required to conduct such inference has been developed only recently. We present a framework for coalescent-based phylogenetic and phylodynamic inference which enables highly flexible modeling of demographic and epidemiological processes. This approach builds upon previous structured coalescent approaches and includes enhancements for computational speed, accuracy, and stability. A flexible markup language is described for translating parametric demographic or epidemiological models into a structured coalescent model enabling simultaneous estimation of demographic or epidemiological parameters and time-scaled phylogenies. We demonstrate the utility of these approaches by fitting compartmental epidemiological models to Ebola virus and Influenza A virus sequence data, demonstrating how important features of these epidemics, such as the reproduction number and epidemic curves, can be gleaned from genetic data. These approaches are provided as an open-source package PhyDyn for the BEAST2 phylogenetics platform.
Subject(s)
Bayes Theorem, Theoretical Models, Phylogeny, Western Africa/epidemiology, Computer Simulation, Epidemics, Population Genetics, Ebola Hemorrhagic Fever/epidemiology, Humans, Human Influenza/epidemiology, Population Surveillance, Seasons, Software Design
ABSTRACT
Unprecedented public health interventions including travel restrictions and national lockdowns have been implemented to stem the COVID-19 epidemic, but the effectiveness of non-pharmaceutical interventions is still debated. We carried out a phylogenetic analysis of more than 29,000 publicly available whole genome SARS-CoV-2 sequences from 57 locations to estimate the time that the epidemic originated in different places. These estimates were examined in relation to the dates of the most stringent interventions in each location as well as to the number of cumulative COVID-19 deaths and phylodynamic estimates of epidemic size. Here we report that the time elapsed between epidemic origin and maximum intervention is associated with different measures of epidemic severity and explains 11% of the variance in reported deaths one month after the most stringent intervention. Locations where strong non-pharmaceutical interventions were implemented earlier experienced much less severe COVID-19 morbidity and mortality during the period of study.
Subject(s)
COVID-19/diagnosis, Communicable Disease Control/methods, Phylogeny, Phylogeography/methods, SARS-CoV-2/genetics, COVID-19/epidemiology, COVID-19/virology, Epidemics, Humans, Public Health/methods, Public Health/statistics & numerical data, SARS-CoV-2/classification, SARS-CoV-2/physiology, Severity of Illness Index
ABSTRACT
Analysis of genetic sequence data from the SARS-CoV-2 pandemic can provide insights into epidemic origins, worldwide dispersal, and epidemiological history. With few exceptions, genomic epidemiological analysis has focused on geographically distributed data sets with few isolates in any given location. Here, we report an analysis of 20 whole SARS-CoV-2 genomes from a single relatively small and geographically constrained outbreak in Weifang, People's Republic of China. Using Bayesian model-based phylodynamic methods, we estimate a mean basic reproduction number (R0) of 3.4 (95% highest posterior density interval: 2.1-5.2) in Weifang, and a mean effective reproduction number (Rt) that falls below 1 on 4 February. We further estimate the number of infections through time and compare these estimates to confirmed diagnoses by the Weifang Centers for Disease Control. We find that these estimates are consistent with reported cases and there is unlikely to be a large undiagnosed burden of infection over the period we studied.
ABSTRACT
As of 1st June 2020, the US Centers for Disease Control and Prevention reported 104,232 confirmed or probable COVID-19-related deaths in the US. This was more than twice the number of deaths reported in the next most severely impacted country. We jointly model the US epidemic at the state level, using publicly available death data within a Bayesian hierarchical semi-mechanistic framework. For each state, we estimate the number of individuals that have been infected, the number of individuals that are currently infectious and the time-varying reproduction number (the average number of secondary infections caused by an infected person). We use changes in mobility to capture the impact that non-pharmaceutical interventions and other behaviour changes have on the rate of transmission of SARS-CoV-2. We estimate that Rt was below one in only 23 states on 1st June. We also estimate that 3.7% [3.4%-4.0%] of the total population of the US had been infected, with wide variation between states, and approximately 0.01% of the population was infectious. We demonstrate good 3-week model forecasts of deaths with low error and good coverage of our credible intervals.
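To make the role of the time-varying reproduction number concrete, the sketch below assumes a discrete renewal equation, one common way to link Rt to new infections. The abstract does not give the model equations, so this is not necessarily the authors' formulation; the function name, the generation-interval weights, and the Rt values are hypothetical.

```python
from typing import List, Sequence


def simulate_infections(seed: float, rt: Sequence[float],
                        generation_pmf: Sequence[float]) -> List[float]:
    """Discrete renewal equation: I_t = R_t * sum_{s>=1} I_{t-s} * g_s.

    generation_pmf[s - 1] is the assumed probability that a secondary
    infection occurs s time steps after the primary infection.
    rt[0] is unused because the first step is the seed.
    """
    infections = [float(seed)]
    for t in range(1, len(rt)):
        # Infectious pressure from all earlier infections within the pmf's support.
        pressure = sum(
            infections[t - s] * generation_pmf[s - 1]
            for s in range(1, min(t, len(generation_pmf)) + 1)
        )
        infections.append(rt[t] * pressure)
    return infections


if __name__ == "__main__":
    # Hypothetical scenario: Rt drops from 2.0 to 0.8 halfway through.
    rt_values = [2.0] * 5 + [0.8] * 5
    print(simulate_infections(10.0, rt_values, [0.2, 0.5, 0.3]))
```

In this sketch, new infections grow while Rt stays above one and shrink once it is held below one, which is why the number of states with Rt below one is a natural summary of epidemic control.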
Subject(s)
COVID-19/epidemiology, Pandemics/statistics & numerical data, Bayes Theorem, COVID-19/transmission, Humans, Statistical Models, United States/epidemiology, Virus Diseases/epidemiology
ABSTRACT
Background: The COVID-19 epidemic was declared a Global Pandemic by WHO on 11 March 2020. By 24 March 2020, over 440,000 cases and almost 20,000 deaths had been reported worldwide. In response to the fast-growing epidemic, which began in the Chinese city of Wuhan, Hubei, China imposed strict social distancing in Wuhan on 23 January 2020 followed closely by similar measures in other provinces. These interventions have impacted economic productivity in China, and the ability of the Chinese economy to resume without restarting the epidemic was not clear. Methods: Using daily reported cases from mainland China and Hong Kong SAR, we estimated transmissibility over time and compared it to daily within-city movement, as a proxy for economic activity. Results: Initially, within-city movement and transmission were very strongly correlated in the five mainland provinces most affected by the epidemic and Beijing. However, that correlation decreased rapidly after the initial sharp fall in transmissibility. In general, towards the end of the study period, the correlation was no longer apparent, despite substantial increases in within-city movement. A similar analysis for Hong Kong shows that intermediate levels of local activity were maintained while avoiding a large outbreak. At the very end of the study period, when China began to experience the re-introduction of a small number of cases from Europe and the United States, there is an apparent up-tick in transmission. Conclusions: Although these results do not preclude future substantial increases in incidence, they suggest that after very intense social distancing (which resulted in containment), China successfully exited its lockdown to some degree. Elsewhere, movement data are being used as proxies for economic activity to assess the impact of interventions. The results presented here illustrate how the eventual decorrelation between transmission and movement is likely a key feature of successful COVID-19 exit strategies.