ABSTRACT
An outbreak of over 1,000 COVID-19 cases in Provincetown, Massachusetts (MA), in July 2021 (the first large outbreak mostly in vaccinated individuals in the US) prompted a comprehensive public health response, motivating changes to national masking recommendations and raising questions about infection and transmission among vaccinated individuals. To address these questions, we combined viral genomic and epidemiological data from 467 individuals, including 40% of outbreak-associated cases. The Delta variant accounted for 99% of cases in this dataset; it was introduced from at least 40 sources, but 83% of cases derived from a single source, likely through transmission across multiple settings over a short time rather than a single event. Genomic and epidemiological data supported multiple transmissions of Delta from and between fully vaccinated individuals. However, despite its magnitude, the outbreak had limited onward impact in MA and the US overall, likely due to high vaccination rates and a robust public health response.
Subject(s)
COVID-19/epidemiology , COVID-19/immunology , COVID-19/transmission , SARS-CoV-2/genetics , SARS-CoV-2/immunology , Adolescent , Adult , Aged , Aged, 80 and over , COVID-19/virology , Child , Child, Preschool , Contact Tracing/methods , Disease Outbreaks , Female , Genome, Viral , Humans , Infant , Infant, Newborn , Male , Massachusetts/epidemiology , Middle Aged , Molecular Epidemiology , Phylogeny , SARS-CoV-2/classification , Vaccination , Whole Genome Sequencing , Young Adult
ABSTRACT
T cell-mediated immunity plays an important role in controlling SARS-CoV-2 infection, but the repertoire of naturally processed and presented viral epitopes on class I human leukocyte antigen (HLA-I) remains uncharacterized. Here, we report the first HLA-I immunopeptidome of SARS-CoV-2 in two cell lines at different times post infection using mass spectrometry. We found HLA-I peptides derived not only from canonical open reading frames (ORFs) but also from internal out-of-frame ORFs in spike and nucleocapsid not captured by current vaccines. Some peptides from out-of-frame ORFs elicited T cell responses in a humanized mouse model and individuals with COVID-19 that exceeded responses to canonical peptides, including some of the strongest epitopes reported to date. Whole-proteome analysis of infected cells revealed that early expressed viral proteins contribute more to HLA-I presentation and immunogenicity. These biological insights, as well as the discovery of out-of-frame ORF epitopes, will facilitate selection of peptides for immune monitoring and vaccine development.
Subject(s)
Epitopes, T-Lymphocyte/immunology , Histocompatibility Antigens Class I/immunology , Open Reading Frames/genetics , Peptides/immunology , Proteome/immunology , SARS-CoV-2/immunology , A549 Cells , Alleles , Amino Acid Sequence , Animals , Antigen Presentation/immunology , COVID-19/immunology , COVID-19/virology , Female , HEK293 Cells , Humans , Kinetics , Male , Mice , Peptides/chemistry , T-Lymphocytes/immunology
ABSTRACT
BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) reinfection is poorly understood, partly because few studies have systematically applied genomic analysis to distinguish reinfection from persistent RNA detection related to initial infection. We aimed to evaluate the characteristics of SARS-CoV-2 reinfection and persistent RNA detection using independent genomic, clinical, and laboratory assessments. METHODS: All individuals at a large academic medical center who underwent a SARS-CoV-2 nucleic acid amplification test (NAAT) ≥45 days after an initial positive test, with both tests between 14 March and 30 December 2020, were analyzed for potential reinfection. Inclusion criteria required having ≥2 positive NAATs collected ≥45 days apart with a cycle threshold (Ct) value <35 at repeat testing. For each included subject, likelihood of reinfection was assessed by viral genomic analysis of all available specimens with a Ct value <35, structured Ct trajectory criteria, and case-by-case review by infectious diseases physicians. RESULTS: Among 1569 individuals with repeat SARS-CoV-2 testing ≥45 days after an initial positive NAAT, 65 (4%) met cohort inclusion criteria. Viral genomic analysis characterized mutations present and was successful for 14/65 (22%) subjects. Six subjects had genomically supported reinfection, and 8 subjects had genomically supported persistent RNA detection. Compared to viral genomic analysis, clinical and laboratory assessments correctly distinguished reinfection from persistent RNA detection in 12/14 (86%) subjects but missed 2/6 (33%) genomically supported reinfections. CONCLUSIONS: Despite good overall concordance with viral genomic analysis, clinical and Ct value-based assessments failed to identify 33% of genomically supported reinfections. Scaling-up genomic analysis for clinical use would improve detection of SARS-CoV-2 reinfections.
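The inclusion rule stated in METHODS above (at least two positive NAATs collected at least 45 days apart, with Ct <35 at repeat testing) can be sketched as a simple filter. This is an illustrative reading of the stated criteria only, not the study's code: the `NaatResult` record, the `meets_inclusion` function, and the choice to measure the 45-day gap from the earliest positive test are assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class NaatResult:
    """One NAAT result for a single individual (hypothetical record layout)."""
    collected: date
    positive: bool
    ct: Optional[float] = None  # Ct value; None when not reported


def meets_inclusion(tests: List[NaatResult],
                    min_gap_days: int = 45,
                    ct_cutoff: float = 35.0) -> bool:
    """Return True if some positive repeat test was collected at least
    min_gap_days after the earliest positive test and has Ct < ct_cutoff.

    Assumption: the gap is measured from the earliest positive test.
    """
    positives = sorted((t for t in tests if t.positive),
                       key=lambda t: t.collected)
    if len(positives) < 2:
        return False
    first = positives[0]
    return any(
        (t.collected - first.collected).days >= min_gap_days
        and t.ct is not None
        and t.ct < ct_cutoff
        for t in positives[1:]
    )
```

A subject whose repeat positive falls within the gap, or whose repeat Ct is 35 or higher, is excluded under this reading.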
Subject(s)
COVID-19 , Humans , COVID-19/diagnosis , COVID-19 Testing , Reinfection/diagnosis , Retrospective Studies , SARS-CoV-2/genetics , RNA
ABSTRACT
Although the recent Zika virus (ZIKV) epidemic in the Americas and its link to birth defects have attracted a great deal of attention, much remains unknown about ZIKV disease epidemiology and ZIKV evolution, in part owing to a lack of genomic data. Here we address this gap in knowledge by using multiple sequencing approaches to generate 110 ZIKV genomes from clinical and mosquito samples from 10 countries and territories, greatly expanding the observed viral genetic diversity from this outbreak. We analysed the timing and patterns of introductions into distinct geographic regions; our phylogenetic evidence suggests rapid expansion of the outbreak in Brazil and multiple introductions of outbreak strains into Puerto Rico, Honduras, Colombia, other Caribbean islands, and the continental United States. We find that ZIKV circulated undetected in multiple regions for many months before the first locally transmitted cases were confirmed, highlighting the importance of surveillance of viral infections. We identify mutations with possible functional implications for ZIKV biology and pathogenesis, as well as those that might be relevant to the effectiveness of diagnostic tests.
Subject(s)
Phylogeny , Zika Virus Infection/transmission , Zika Virus Infection/virology , Zika Virus/genetics , Zika Virus/isolation & purification , Animals , Brazil/epidemiology , Colombia/epidemiology , Culicidae/virology , Disease Outbreaks/statistics & numerical data , Genome, Viral/genetics , Geographic Mapping , Honduras/epidemiology , Humans , Metagenome/genetics , Molecular Epidemiology , Mosquito Vectors/virology , Mutation , Public Health Surveillance , Puerto Rico/epidemiology , United States/epidemiology , Zika Virus/classification , Zika Virus/pathogenicity , Zika Virus Infection/diagnosis , Zika Virus Infection/epidemiology
Subject(s)
COVID-19/diagnosis , Immunocompromised Host , Liver Transplantation , Reinfection , COVID-19/virology , COVID-19 Nucleic Acid Testing , Humans , Immunosuppressive Agents , Male , Middle Aged , Mycophenolic Acid/therapeutic use , Phylogeny , SARS-CoV-2/genetics , Tacrolimus/therapeutic use
ABSTRACT
Nigeria and Cameroon reported their first mpox cases in over three decades in 2017 and 2018, respectively. The outbreak in Nigeria is recognised as an ongoing human epidemic. However, owing to sparse surveillance and genomic data, it is not known whether the increase in cases in Cameroon is driven by zoonotic or sustained human transmission. Notably, the frequency of zoonotic transmission remains unknown in both Cameroon and Nigeria. To address these uncertainties, we investigated the zoonotic transmission dynamics of the mpox virus (MPXV) in Cameroon and Nigeria, with a particular focus on the border regions. We show that in these regions mpox cases are still driven by zoonotic transmission of a newly identified Clade IIb.1. We identify two distinct zoonotic lineages that circulate across the Nigeria-Cameroon border, with evidence of recent and historic cross-border dissemination. Our findings suggest that the complex cross-border forest ecosystems likely host shared animal populations that drive cross-border viral spread, and that these ecosystems are likely where extant Clade IIb originated. We identify that the closest zoonotic outgroup to the human epidemic circulated in southern Nigeria in October 2013. We also show that the zoonotic precursor lineage circulated in an animal population in southern Nigeria for more than 45 years. This supports findings that southern Nigeria was the origin of the human epidemic. Our study highlights the ongoing MPXV zoonotic transmission in Cameroon and Nigeria, underscoring the continuous risk of MPXV (re)emergence.
ABSTRACT
Five years before the 2022-2023 global mpox outbreak, Nigeria reported its first cases in nearly 40 years, and the ongoing epidemic has since been driven by sustained human-to-human transmission. However, limited genomic data have left questions about the timing and origin of the emergence of the mpox virus (MPXV). Here we generated 112 MPXV genomes from Nigeria from 2021-2023. We identify the closest zoonotic outgroup to the human epidemic in southern Nigeria, and estimate that the lineage transmitting from human to human emerged around July 2014, circulating cryptically until detected in September 2017. The epidemic originated in southern Nigeria, particularly Rivers State, which also acted as a persistent and dominant source of viral dissemination to other states. We show that APOBEC3 activity increased MPXV's evolutionary rate twenty-fold during human-to-human transmission. We also show how Delphy, a tool for near-real-time Bayesian phylogenetics, can aid rapid outbreak analytics. Our study sheds light on MPXV's establishment in West Africa before the 2022-2023 global outbreak and highlights the need for improved pathogen surveillance and response.
ABSTRACT
BACKGROUND: Universities are vulnerable to infectious disease outbreaks, making them ideal environments to study transmission dynamics and evaluate mitigation and surveillance measures. Here, we analyze multimodal COVID-19-associated data collected during the 2020-2021 academic year at Colorado Mesa University and introduce a SARS-CoV-2 surveillance and response framework. METHODS: We analyzed epidemiological and sociobehavioral data (demographics, contact tracing, and WiFi-based co-location data) alongside pathogen surveillance data (wastewater and diagnostic testing, and viral genomic sequencing of wastewater and clinical specimens) to characterize outbreak dynamics and inform policy. We applied relative risk, multiple linear regression, and social network assortativity to identify attributes or behaviors associated with contracting SARS-CoV-2. To characterize SARS-CoV-2 transmission, we used viral sequencing, phylogenomic tools, and functional assays. FINDINGS: Athletes, particularly those on high-contact teams, had the highest risk of testing positive. On average, individuals who tested positive had more contacts and longer interaction durations than individuals who never tested positive. The distribution of contacts per individual was overdispersed, although not as overdispersed as the distribution of phylogenomic descendants. Corroboration via technical replicates was essential for identification of wastewater mutations. CONCLUSIONS: Based on our findings, we formulate a framework that combines tools into an integrated disease surveillance program that can be implemented in other congregate settings with limited resources. FUNDING: This work was supported by the National Science Foundation, the Hertz Foundation, the National Institutes of Health, the Centers for Disease Control and Prevention, the Massachusetts Consortium on Pathogen Readiness, the Howard Hughes Medical Institute, the Flu Lab, and the Audacious Project.
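Relative risk, one of the measures named in METHODS above, is the ratio of the risk of testing positive in one group to the risk in a comparison group. The sketch below shows only the textbook formula; the function name, its arguments, and the counts in the usage line are hypothetical and are not data from the study.

```python
def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group sizes must be positive")
    if unexposed_cases == 0:
        raise ValueError("relative risk is undefined with zero unexposed cases")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical counts: 20 of 100 in one group test positive versus 10 of 100
# in the comparison group, so the first group's risk is twice as high.
rr = relative_risk(20, 100, 10, 100)
```

A value above 1 indicates higher risk in the first group; a confidence interval would be needed before drawing conclusions from real counts.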
Subject(s)
COVID-19 , SARS-CoV-2 , United States , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Disease Outbreaks , Universities , Contact Tracing
ABSTRACT
The SARS-CoV-2 Delta variant rose to dominance in mid-2021, likely propelled by an estimated 40%-80% increased transmissibility over Alpha. To investigate if this ostensible difference in transmissibility is uniform across populations, we partner with public health programs from all six states in New England in the United States. We compare logistic growth rates during each variant's respective emergence period, finding that Delta emerged 1.37-2.63 times faster than Alpha (range across states). We compute variant-specific effective reproductive numbers, estimating that Delta is 63%-167% more transmissible than Alpha (range across states). Finally, we estimate that Delta infections generate on average 6.2 (95% CI 3.1-10.9) times more viral RNA copies per milliliter than Alpha infections during their respective emergence. Overall, our evidence suggests that Delta's enhanced transmissibility can be attributed to its innate ability to increase infectiousness, but its epidemiological dynamics may vary depending on underlying population attributes and sequencing data availability.
Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , Humans , New England/epidemiology , Public Health , SARS-CoV-2/genetics
ABSTRACT
Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that data analyses should ideally be conducted in a reproducible way. Reproducibility enables technical validation and regeneration of results on the original or even new data. However, reproducibility alone is by no means sufficient to deliver an analysis that is of lasting impact (i.e., sustainable) for the field, or even just one research group. We postulate that it is equally important to ensure adaptability and transparency. The former describes the ability to modify the analysis to answer extended or slightly different research questions. The latter describes the ability to understand the analysis in order to judge whether it is not only technically, but methodologically valid. Here, we analyze the properties needed for a data analysis to become reproducible, adaptable, and transparent. We show how the popular workflow management system Snakemake can be used to guarantee this, and how it enables an ergonomic, combined, unified representation of all steps involved in data analysis, ranging from raw data processing, to quality control and fine-grained, interactive exploration and plotting of final results.
Subject(s)
Data Analysis , Software , Reproducibility of Results , Workflow
ABSTRACT
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Delta variant quickly rose to dominance in mid-2021, displacing other variants, including Alpha. Studies using data from the United Kingdom and India estimated that Delta was 40-80% more transmissible than Alpha, allowing Delta to become the globally dominant variant. However, it was unclear if the ostensible difference in relative transmissibility was due mostly to innate properties of Delta's infectiousness or differences in the study populations. To investigate, we formed a partnership with SARS-CoV-2 genomic surveillance programs from all six New England US states. By comparing logistic growth rates, we found that Delta emerged 37-163% faster than Alpha in early 2021 (37% Massachusetts, 75% New Hampshire, 95% Maine, 98% Rhode Island, 151% Connecticut, and 163% Vermont). We next computed variant-specific effective reproductive numbers and estimated that Delta was 58-120% more transmissible than Alpha across New England (58% New Hampshire, 68% Massachusetts, 76% Connecticut, 85% Rhode Island, 98% Maine, and 120% Vermont). Finally, using RT-PCR data, we estimated that Delta infections generate on average ~6 times more viral RNA copies per mL than Alpha infections. Overall, our evidence indicates that Delta's enhanced transmissibility could be attributed to its innate ability to increase infectiousness, but its epidemiological dynamics may vary depending on the underlying immunity and behavior of distinct populations.
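The comparison of logistic growth rates described above can be illustrated with a minimal sketch. It assumes that each variant's frequency among sequenced cases follows a logistic curve, so that the growth rate is the slope of logit(frequency) against time, and that "X% faster" means the ratio of the two slopes minus one. The helper names and the synthetic frequencies are hypothetical; the abstract does not specify the fitting procedure, and this is not the authors' implementation.

```python
import math
from typing import List, Sequence


def logistic_growth_rate(days: Sequence[float], freqs: Sequence[float]) -> float:
    """Least-squares slope of logit(frequency) against time, per day.

    Frequencies must lie strictly between 0 and 1.
    """
    logits = [math.log(f / (1.0 - f)) for f in freqs]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logits) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logits))
    sxx = sum((t - mean_t) ** 2 for t in days)
    return sxy / sxx


def synthetic_freqs(rate: float, midpoint: float,
                    days: Sequence[float]) -> List[float]:
    """Noise-free logistic curve, used only to build hypothetical inputs."""
    return [1.0 / (1.0 + math.exp(-rate * (t - midpoint))) for t in days]


# Hypothetical weekly frequencies during each variant's emergence period.
days = list(range(0, 70, 7))
alpha_freqs = synthetic_freqs(0.05, 40.0, days)
delta_freqs = synthetic_freqs(0.10, 40.0, days)

r_alpha = logistic_growth_rate(days, alpha_freqs)
r_delta = logistic_growth_rate(days, delta_freqs)
percent_faster = 100.0 * (r_delta / r_alpha - 1.0)
```

With real surveillance data the frequencies are noisy counts, so a binomial (logistic regression) fit with uncertainty intervals would be more appropriate than the plain least-squares slope used here.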
ABSTRACT
Multiple summer events, including large indoor gatherings, in Provincetown, Massachusetts (MA), in July 2021 contributed to an outbreak of over one thousand COVID-19 cases among residents and visitors. Most cases occurred in fully vaccinated individuals, many of whom were also symptomatic, prompting a comprehensive public health response, motivating changes to national masking recommendations, and raising questions about infection and transmission among vaccinated individuals. To characterize the outbreak and the viral population underlying it, we combined genomic and epidemiological data from 467 individuals, including 40% of known outbreak-associated cases. The Delta variant accounted for 99% of sequenced outbreak-associated cases. Phylogenetic analysis suggests over 40 sources of Delta in the dataset, with one responsible for a single cluster containing 83% of outbreak-associated genomes. This cluster was likely not the result of extensive spread at a single site, but rather transmission from a common source across multiple settings over a short time. Genomic and epidemiological data combined provide strong support for 25 transmission events from fully vaccinated individuals, many of them between fully vaccinated individuals; genomic data alone provide evidence for an additional 64. Together, genomic epidemiology provides a high-resolution picture of the Provincetown outbreak, revealing multiple cases of transmission of Delta from fully vaccinated individuals. However, despite its magnitude, the outbreak was restricted in its onward impact in MA and the US, likely due to high vaccination rates and a robust public health response.
ABSTRACT
Analysis of 772 complete severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes from early in the Boston-area epidemic revealed numerous introductions of the virus, a small number of which led to most cases. The data revealed two superspreading events. One, in a skilled nursing facility, led to rapid transmission and significant mortality in this vulnerable population but little broader spread, whereas other introductions into the facility had little effect. The second, at an international business conference, produced sustained community transmission and was exported, resulting in extensive regional, national, and international spread. The two events also differed substantially in the genetic variation they generated, suggesting varying transmission dynamics in superspreading events. Our results show how genomic epidemiology can help to understand the link between individual clusters and wider community spread.
Subject(s)
COVID-19/epidemiology , Genome, Viral , Phylogeny , SARS-CoV-2/genetics , Boston/epidemiology , COVID-19/transmission , Disease Outbreaks , Epidemiological Monitoring , Humans
ABSTRACT
T cell-mediated immunity may play a critical role in controlling and establishing protective immunity against SARS-CoV-2 infection; yet the repertoire of viral epitopes responsible for T cell response activation remains mostly unknown. Identification of viral peptides presented on class I human leukocyte antigen (HLA-I) can reveal epitopes for recognition by cytotoxic T cells and potential incorporation into vaccines. Here, we report the first HLA-I immunopeptidome of SARS-CoV-2 in two human cell lines at different times post-infection using mass spectrometry. We found HLA-I peptides derived not only from canonical ORFs, but also from internal out-of-frame ORFs in Spike and Nucleoprotein not captured by current vaccines. Proteomics analyses of infected cells revealed that SARS-CoV-2 may interfere with antigen processing and immune signaling pathways. Based on the endogenously processed and presented viral peptides that we identified, we estimate that a pool of 24 peptides would provide one or more peptides for presentation by at least one HLA allele in 99% of the human population. These biological insights and the list of naturally presented SARS-CoV-2 peptides will facilitate data-driven selection of peptides for immune monitoring and vaccine development.
ABSTRACT
SARS-CoV-2 has caused a severe, ongoing outbreak of COVID-19 in Massachusetts with 111,070 confirmed cases and 8,433 deaths as of August 1, 2020. To investigate the introduction, spread, and epidemiology of COVID-19 in the Boston area, we sequenced and analyzed 772 complete SARS-CoV-2 genomes from the region, including nearly all confirmed cases within the first week of the epidemic and hundreds of cases from major outbreaks at a conference, a nursing facility, and among homeless shelter guests and staff. The data reveal over 80 introductions into the Boston area, predominantly from elsewhere in the United States and Europe. We studied two superspreading events covered by the data, events that led to very different outcomes because of the timing and populations involved. One produced rapid spread in a vulnerable population but little onward transmission, while the other was a major contributor to sustained community transmission, including outbreaks in homeless populations, and was exported to several other domestic and international sites. The same two events differed significantly in the number of new mutations seen, raising the possibility that SARS-CoV-2 superspreading might encompass disparate transmission dynamics. Our results highlight the failure of measures to prevent importation into MA early in the outbreak, underscore the role of superspreading in amplifying an outbreak in a major urban area, and lay a foundation for contact tracing informed by genetic data.