1.
bioRxiv ; 2024 Apr 30.
Article En | MEDLINE | ID: mdl-38746185

The SARS-CoV-2 genome occupies a unique place in infection biology: it is the most highly sequenced genome on earth (making up over 20% of public sequencing datasets) with fine scale information on sampling date and geography, and has been subject to unprecedentedly intense analysis. As a result, these phylogenetic data are an incredibly valuable resource for science and public health. However, the vast majority of the data was sequenced by tiling amplicons across the full genome, with amplicon schemes that changed over the pandemic as mutations in the viral genome interacted with primer binding sites. In combination with the disparate set of genome assembly workflows and lack of consistent quality control (QC) processes, the current genomes have many systematic errors that have evolved with the virus and amplicon schemes. These errors have significant impacts on the phylogeny, and therefore over the last few years, many thousands of hours of researchers' time have been spent in "eyeballing" trees, looking for artefacts, and then patching the tree. Given the huge value of this dataset, we therefore set out to reprocess the complete set of public raw sequence data in a rigorous amplicon-aware manner, and build a cleaner phylogeny. Here we provide a global tree of 3,960,704 samples, built from a consistently assembled set of high quality consensus sequences from all available public data as of March 2023, viewable at https://viridian.taxonium.org. Each genome was constructed using a novel assembly tool called Viridian (https://github.com/iqbal-lab-org/viridian), developed specifically to process amplicon sequence data, eliminating artefactual errors and masking the genome at low quality positions. We provide simulation and empirical validation of the methodology, and quantify the improvement in the phylogeny. Phase 2 of our project will address the fact that the data in the public archives is heavily geographically biased towards the Global North. We have therefore contributed new raw data to ENA/SRA from many countries including Ghana, Thailand, Laos, Sri Lanka, India, Argentina and Singapore. We will incorporate these, along with all public raw data submitted between March 2023 and the current day, into an updated set of assemblies and phylogeny. We hope the tree, consensus sequences and Viridian will be a valuable resource for researchers.

2.
Nat Commun ; 15(1): 2175, 2024 Mar 11.
Article En | MEDLINE | ID: mdl-38467646

In the ENSEMBLE randomized, placebo-controlled phase 3 trial (NCT04505722), estimated single-dose Ad26.COV2.S vaccine efficacy (VE) was 56% against moderate to severe-critical COVID-19. SARS-CoV-2 Spike sequences were determined from 484 vaccine and 1,067 placebo recipients who acquired COVID-19. In this set of prespecified analyses, we show that in Latin America, VE was significantly lower against Lambda vs. Reference and against Lambda vs. non-Lambda [family-wise error rate (FWER) p < 0.05]. VE differed by residue match vs. mismatch to the vaccine-insert at 16 amino acid positions (4 FWER p < 0.05; 12 q-value ≤ 0.20); significantly decreased with physicochemical-weighted Hamming distance to the vaccine-strain sequence for Spike, receptor-binding domain, N-terminal domain, and S1 (FWER p < 0.001); differed (FWER ≤ 0.05) by distance to the vaccine strain measured by 9 antibody-epitope escape scores and 4 NTD neutralization-impacting features; and decreased (p = 0.011) with neutralization resistance level to vaccinee sera. VE against severe-critical COVID-19 was stable across most sequence features but lower against the most distant viruses.
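
As an illustration of the distance measure named in the abstract above, the following is a minimal sketch of a physicochemical-weighted Hamming distance between two aligned amino acid sequences. The abstract does not give the weighting scheme used in the trial, so the residue classes and the `within`/`between` weights below are hypothetical and chosen only to show the idea: a mismatch within one physicochemical class costs less than a mismatch across classes.

```python
# Hypothetical physicochemical classes (an assumption for illustration only;
# the trial's actual weighting scheme is not stated in the abstract).
CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "positive": "KRH",
    "negative": "DE",
}
AA_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def weighted_hamming(seq, ref, within=0.5, between=1.0):
    """Sum mismatch weights over aligned positions of two equal-length sequences.

    A mismatch between residues of the same class costs `within`;
    any other mismatch (including gaps or unknown symbols) costs `between`.
    """
    if len(seq) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    total = 0.0
    for a, b in zip(seq, ref):
        if a == b:
            continue  # identical residues contribute nothing
        same_class = a in AA_CLASS and AA_CLASS.get(a) == AA_CLASS.get(b)
        total += within if same_class else between
    return total


if __name__ == "__main__":
    # K->R and I->L stay within a class; D->A crosses classes.
    print(weighted_hamming("KID", "RLA"))
```

With equal weights for both kinds of mismatch, the function reduces to the ordinary Hamming distance.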


Ad26COVS1 , COVID-19 , Humans , COVID-19/prevention & control , SARS-CoV-2 , Vaccine Efficacy , Amino Acids , Antibodies, Viral , Antibodies, Neutralizing
3.
Emerg Infect Dis ; 29(5)2023 05.
Article En | MEDLINE | ID: mdl-37054986

Since late 2020, SARS-CoV-2 variants have regularly emerged with competitive and phenotypic differences from previously circulating strains, sometimes with the potential to escape from immunity produced by prior exposure and infection. The Early Detection group is one of the constituent groups of the US National Institutes of Health National Institute of Allergy and Infectious Diseases SARS-CoV-2 Assessment of Viral Evolution program. The group uses bioinformatic methods to monitor the emergence, spread, and potential phenotypic properties of emerging and circulating strains to identify the most relevant variants for experimental groups within the program to phenotypically characterize. Since April 2021, the group has prioritized variants monthly. Prioritization successes include rapidly identifying most major variants of SARS-CoV-2 and providing experimental groups within the National Institutes of Health program easy access to regularly updated information on the recent evolution and epidemiology of SARS-CoV-2 that can be used to guide phenotypic investigations.


COVID-19 , SARS-CoV-2 , United States/epidemiology , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , National Institutes of Health (U.S.)
4.
Nature ; 605(7911): 640-652, 2022 05.
Article En | MEDLINE | ID: mdl-35361968

The global emergence of many severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants jeopardizes the protective antiviral immunity induced after infection or vaccination. To address the public health threat caused by the increasing SARS-CoV-2 genomic diversity, the National Institute of Allergy and Infectious Diseases within the National Institutes of Health established the SARS-CoV-2 Assessment of Viral Evolution (SAVE) programme. This effort was designed to provide a real-time risk assessment of SARS-CoV-2 variants that could potentially affect the transmission, virulence, and resistance to infection- and vaccine-induced immunity. The SAVE programme is a critical data-generating component of the US Government SARS-CoV-2 Interagency Group to assess implications of SARS-CoV-2 variants on diagnostics, vaccines and therapeutics, and for communicating public health risk. Here we describe the coordinated approach used to identify and curate data about emerging variants, their impact on immunity and effects on vaccine protection using animal models. We report the development of reagents, methodologies, models and notable findings facilitated by this collaborative approach and identify future challenges. This programme is a template for the response to rapidly evolving pathogens with pandemic potential by monitoring viral evolution in the human population to identify variants that could reduce the effectiveness of countermeasures.


COVID-19 , SARS-CoV-2 , Animals , Biological Evolution , COVID-19 Vaccines , Humans , National Institute of Allergy and Infectious Diseases (U.S.) , Pandemics/prevention & control , Pharmacogenomic Variants , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , United States/epidemiology , Virulence
5.
PLoS Pathog ; 18(3): e1010369, 2022 03.
Article En | MEDLINE | ID: mdl-35303045

Eliciting broadly neutralizing antibodies (bnAbs) is a cornerstone of HIV-1 vaccine strategies. Comparing HIV-1 envelope (env) sequences from the first weeks of infection to the breadth of antibody responses observed several years after infection can help define viral features critical to vaccine design. We investigated the relationship between HIV-1 env genetics and the development of neutralization breadth in 70 individuals enrolled in a prospective acute HIV-1 cohort. Half of the individuals who developed bnAbs were infected with multiple HIV-1 founder variants, whereas all individuals with limited neutralization breadth had been infected with single HIV-1 founders. Accordingly, at HIV-1 diagnosis, env diversity was significantly higher in participants who later developed bnAbs compared to those with limited breadth (p = 0.012). This association between founder multiplicity and the subsequent development of neutralization breadth was also observed in 56 placebo recipients in the RV144 vaccine efficacy trial. In addition, we found no evidence that neutralization breadth was heritable when analyzing env sequences from the 126 participants. These results demonstrate that the presence of slightly different HIV-1 variants in acute infection could promote the induction of bnAbs, suggesting a novel vaccine strategy, whereby an initial immunization with a cocktail of minimally distant antigens would be able to initiate bnAb development towards breadth.


HIV-1 , Antibodies, Neutralizing , Epitopes , HIV Antibodies , HIV-1/genetics , Humans , Prospective Studies , env Gene Products, Human Immunodeficiency Virus/genetics
6.
J R Soc Interface ; 18(179): 20210314, 2021 06.
Article En | MEDLINE | ID: mdl-34186015

Clinical trials for HIV prevention can require knowledge of infection times to subsequently determine protective drug levels. Yet, infection timing is difficult when study visits are sparse. Using population nonlinear mixed-effects (pNLME) statistical inference and viral loads from 46 RV217 study participants, we developed a relatively simple HIV primary infection model that achieved an excellent fit to all data. We also discovered that Aptima assay values from the study strongly correlated with viral loads, enabling imputation of very early viral loads for 28/46 participants. Estimated times between infecting exposures and first positives were generally longer than prior estimates (average of two weeks) and were robust to missing viral upslope data. On simulated data, we found that tighter sampling before diagnosis improved estimation more than tighter sampling after diagnosis. Sampling weekly before and monthly after diagnosis was a pragmatic design for good timing accuracy. Our pNLME timing approach is widely applicable to other infections with existing mathematical models. The present model could be used to simulate future HIV trials and may help estimate protective thresholds from the recently completed antibody-mediated prevention trials.


HIV Infections , HIV-1 , HIV Infections/epidemiology , Humans , Models, Theoretical , Viral Load
8.
Virus Evol ; 3(2): vex020, 2017 Jul.
Article En | MEDLINE | ID: mdl-28852573

Phylogenetic methods are being increasingly used to help understand the transmission dynamics of measurably evolving viruses, including HIV. Clusters of highly similar sequences are often observed, which appear to follow a 'power law' behaviour, with a small number of very large clusters. These clusters may help to identify subpopulations in an epidemic, and inform where intervention strategies should be implemented. However, clustering of samples does not necessarily imply the presence of a subpopulation with high transmission rates, as groups of closely related viruses can also occur due to non-epidemiological effects such as over-sampling. It is important to ensure that observed phylogenetic clustering reflects true heterogeneity in the transmitting population, and is not being driven by non-epidemiological effects. We quantify the effect of using a falsely identified 'transmission cluster' of sequences to estimate phylodynamic parameters including the effective population size and exponential growth rate under several demographic scenarios. Our simulation studies show that taking the maximum size cluster to re-estimate parameters from trees simulated under a randomly mixing, constant population size coalescent process systematically underestimates the overall effective population size. In addition, the transmission cluster wrongly resembles an exponential or logistic growth model 99% of the time. We also illustrate the consequences of false clusters in exponentially growing coalescent and birth-death trees, where again, the growth rate is skewed upwards. This has clear implications for identifying clusters in large viral databases, where a false cluster could result in wasted intervention resources.

9.
J Virol ; 91(9)2017 05 01.
Article En | MEDLINE | ID: mdl-28202767

Hepatitis E virus (HEV) is the most common cause of acute viral hepatitis globally. HEV comprises four genotypes with different geographic distributions and host ranges. We utilize this natural case-control study for investigating the evolution of zoonotic viruses compared to single-host viruses, using 244 near-full-length HEV genomes. Genome-wide estimates of the ratio of nonsynonymous to synonymous evolutionary changes (dN/dS ratio) located a region of overlapping reading frames, which is subject to positive selection in genotypes 3 and 4. The open reading frames (ORFs) involved have functions related to host-pathogen interaction, so genotype-specific evolution of these regions may reflect their fitness. Bayesian inference of evolutionary rates shows that genotypes 3 and 4 have significantly higher rates than genotype 1 across all ORFs. Reconstruction of the phylogenies of zoonotic genotypes demonstrates significant intermingling of isolates between hosts. We speculate that the genotype-specific differences may result from cyclical adaptation to different hosts in genotypes 3 and 4. IMPORTANCE: Hepatitis E virus (HEV) is increasingly recognized as a pathogen that affects both the developing and the developed world. While most often clinically mild, HEV can be severe or fatal in certain demographics, such as expectant mothers. Like many other viral pathogens, HEV has been classified into several distinct genotypes. We show that most of the HEV genome is evolutionarily constrained. One locus of positive selection is unusual in that it encodes two distinct protein products. We are the first to detect positive selection in this overlap region. Genotype 1, which infects humans only, appears to be evolving differently from genotypes 3 and 4, which infect multiple species, possibly because genotypes 3 and 4 are unable to achieve the same fitness due to repeated host jumps.
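
The dN/dS ratio mentioned in the abstract above rests on classifying codon changes as synonymous (same amino acid) or nonsynonymous (different amino acid). The sketch below shows only that classification step for two aligned, in-frame, uppercase DNA sequences, using the standard genetic code. It is not the authors' pipeline and it is not a full dN/dS estimate: a real estimate also normalises by the numbers of synonymous and nonsynonymous sites and corrects for multiple substitutions. The function name `count_codon_differences` is illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Codons containing anything other than T, C, A, G (gaps, N) are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        ca, cb = seq_a[i:i + 3], seq_b[i:i + 3]
        if ca == cb or ca not in CODON_TABLE or cb not in CODON_TABLE:
            continue
        if CODON_TABLE[ca] == CODON_TABLE[cb]:
            syn += 1      # amino acid unchanged
        else:
            nonsyn += 1   # amino acid changed
    return syn, nonsyn


if __name__ == "__main__":
    # CTT -> CTC keeps leucine; AAA -> AGA changes lysine to arginine.
    print(count_codon_differences("ATGCTTAAA", "ATGCTCAGA"))
```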


Biological Evolution , Genome, Viral/genetics , Hepatitis E virus/genetics , Host Specificity/genetics , Host-Pathogen Interactions/genetics , Animals , Base Sequence , Case-Control Studies , Genotype , Hepatitis E/virology , Hepatitis E virus/isolation & purification , Humans , Open Reading Frames/genetics , Phylogeny , Sequence Analysis, DNA , Swine , Zoonoses/virology
10.
ISME J ; 10(3): 721-9, 2016 Mar.
Article En | MEDLINE | ID: mdl-26305157

Campylobacter jejuni and Campylobacter coli are the biggest causes of bacterial gastroenteritis in the developed world, with human infections typically arising from zoonotic transmission associated with infected meat. Because Campylobacter is not thought to survive well outside the gut, host-associated populations are genetically isolated to varying degrees. Therefore, the likely origin of most strains can be determined by host-associated variation in the genome. This is instructive for characterizing the source of human infection. However, some common strains, notably isolates belonging to the ST-21, ST-45 and ST-828 clonal complexes, appear to have broad host ranges, hindering source attribution. Here whole-genome sequencing has the potential to reveal fine-scale genetic structure associated with host specificity. We found that rates of zoonotic transmission among animal host species in these clonal complexes were so high that the signal of host association is all but obliterated, estimating one zoonotic transmission event every 1.6, 1.8 and 12 years in the ST-21, ST-45 and ST-828 complexes, respectively. We attributed 89% of clinical cases to a chicken source, 10% to cattle and 1% to pig. Our results reveal that common strains of C. jejuni and C. coli infectious to humans are adapted to a generalist lifestyle, permitting rapid transmission between different hosts. Furthermore, they show that the weak signal of host association within these complexes presents a challenge for pinpointing the source of clinical infections, underlining the view that whole-genome sequencing, powerful though it is, cannot substitute for intensive sampling of suspected transmission reservoirs.


Campylobacter Infections/microbiology , Campylobacter Infections/veterinary , Campylobacter/isolation & purification , Animals , Campylobacter/classification , Campylobacter/genetics , Campylobacter/physiology , Campylobacter Infections/transmission , Cattle , Cattle Diseases/microbiology , Cattle Diseases/transmission , Chickens , Host Specificity , Humans , Poultry Diseases/microbiology , Poultry Diseases/transmission , Swine , Swine Diseases/microbiology , Swine Diseases/transmission
11.
PLoS Comput Biol ; 11(7): e1004312, 2015 Jul.
Article En | MEDLINE | ID: mdl-26147205

Previous work has shown that asymmetry in viral phylogenies may be indicative of heterogeneity in transmission, for example due to acute HIV infection or the presence of 'core groups' with higher contact rates. Hence, evidence of asymmetry may provide clues to underlying population structure, even when direct information on, for example, stage of infection or contact rates, are missing. However, current tests of phylogenetic asymmetry (a) suffer from false positives when the tips of the phylogeny are sampled at different times and (b) only test for global asymmetry, and hence suffer from false negatives when asymmetry is localised to part of a phylogeny. We present a simple permutation-based approach for testing for asymmetry in a phylogeny, where we compare the observed phylogeny with random phylogenies with the same sampling and coalescence times, to reduce the false positive rate. We also demonstrate how profiles of measures of asymmetry calculated over a range of evolutionary times in the phylogeny can be used to identify local asymmetry. In combination with different metrics of asymmetry, this combined approach offers detailed insights of how phylogenies reconstructed from real viral datasets may deviate from the simplistic assumptions of commonly used coalescent and birth-death process models.
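
The permutation idea described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: trees are nested tuples, asymmetry is measured with the Colless index (one of several possible metrics), and the sampling and coalescence times are reduced to an ordered list of events going backwards in time (`"s"` for a newly sampled tip, `"c"` for a coalescence). Random topologies are generated by merging two uniformly chosen extant lineages at each coalescence, so every random tree shares the event ordering of the observed tree. All function names are hypothetical.

```python
import random


def colless(tree):
    """Return (tip_count, Colless index) for a nested-tuple binary tree."""
    if not isinstance(tree, tuple):
        return 1, 0
    n_left, c_left = colless(tree[0])
    n_right, c_right = colless(tree[1])
    return n_left + n_right, c_left + c_right + abs(n_left - n_right)


def random_tree(events, rng):
    """Build a random topology from events ordered backwards in time.

    's' adds a newly sampled tip; 'c' merges two uniformly chosen lineages.
    """
    lineages = []
    next_tip = 0
    for event in events:
        if event == "s":
            lineages.append(next_tip)
            next_tip += 1
        else:
            i, j = rng.sample(range(len(lineages)), 2)
            merged = (lineages[i], lineages[j])
            lineages = [x for k, x in enumerate(lineages) if k not in (i, j)]
            lineages.append(merged)
    if len(lineages) != 1:
        raise ValueError("events must coalesce to a single root")
    return lineages[0]


def permutation_p_value(observed_tree, events, n_perm=999, seed=0):
    """One-sided p-value: share of random trees at least as asymmetric."""
    rng = random.Random(seed)
    observed = colless(observed_tree)[1]
    hits = sum(
        colless(random_tree(events, rng))[1] >= observed for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Eight tips sampled at three time points, seven coalescences.
    events = list("ssss" "c" "ss" "c" "ss" "ccccc")
    caterpillar = (((((((0, 1), 2), 3), 4), 5), 6), 7)
    print(permutation_p_value(caterpillar, events))
```

Adding one to both the numerator and the denominator keeps the estimated p-value strictly above zero, a common convention for Monte Carlo permutation tests. Computing the statistic over subtrees or time windows rather than the whole tree would be one way to look for the local asymmetry the abstract mentions.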


Biological Evolution , Genetics, Population , Models, Genetic , Mutation/genetics , Phylogeny , Viruses/genetics , Computer Simulation , Genetic Speciation , Genetic Variation/genetics , Models, Statistical , Pedigree
12.
Virol J ; 10: 335, 2013 Nov 13.
Article En | MEDLINE | ID: mdl-24220146

BACKGROUND: Norovirus is the commonest cause of epidemic gastroenteritis among people of all ages. Outbreaks frequently occur in hospitals and the community, costing the UK an estimated £110 m per annum. An evolutionary explanation for periodic increases in norovirus cases, despite some host-specific post immunity, is currently limited to the identification of obvious recombinants. Our understanding could be significantly enhanced by full length genome sequences for large numbers of intensively sampled viruses, which would also assist control and vaccine design. Our objective is to develop rapid, high-throughput, end-to-end methods yielding complete norovirus genome sequences. We apply these methods to recent English outbreaks, placing them in the wider context of the international norovirus epidemic of winter 2012. METHOD: Norovirus sequences were generated from 28 unique clinical samples by Illumina RNA sequencing (RNA-Seq) of total faecal RNA. A range of de novo sequence assemblers was tried. The best assembler was identified by validation against three replicate samples and two norovirus qPCR negative samples, together with an additional 20 sequences determined by PCR and fractional capillary sequencing. Phylogenetic methods were used to reconstruct evolutionary relationships from the whole genome sequences. RESULTS: Full length norovirus genomes were generated from 23/28 samples. 5/28 partial norovirus genomes were associated with low viral copy numbers. The de novo assembled sequences differed from sequences determined by capillary sequencing by <0.003%. Intra-host nucleotide sequence diversity was rare, but detectable by mapping short sequence reads onto the de novo assembled consensus. Genomes similar to the Sydney 2012 strain caused 78% (18/23) of cases, consistent with its previously documented association with the winter 2012 global outbreak. Interestingly, phylogenetic analysis and recombination detection analysis of the consensus sequences identified two related viruses as recombinants, containing sequences in circulation prior to Sydney 2012 in open reading frame (ORF) 2. CONCLUSION: Our approach facilitates the rapid determination of complete norovirus genomes. This method provides high resolution of full norovirus genomes which, when coupled with detailed epidemiology, may improve the understanding of evolution and control of this important healthcare-associated pathogen.


Caliciviridae Infections/epidemiology , Caliciviridae Infections/virology , Disease Outbreaks , Genome, Viral , Norovirus/classification , Norovirus/genetics , Sequence Analysis, DNA , Cluster Analysis , England/epidemiology , Humans , Molecular Sequence Data , Norovirus/isolation & purification , Phylogeny , RNA, Viral/genetics , Sequence Homology
...