ABSTRACT
The highly transmissible B.1.1.7 variant of SARS-CoV-2, first identified in the United Kingdom, has gained a foothold across the world. Using S gene target failure (SGTF) and SARS-CoV-2 genomic sequencing, we investigated the prevalence and dynamics of this variant in the United States (US), tracking it back to its early emergence. We found that, while the fraction of B.1.1.7 varied by state, the variant increased at a logistic rate, doubling roughly every week, with transmission increased by 40%-50%. We revealed several independent introductions of B.1.1.7 into the US as early as late November 2020, with community transmission spreading it to most states within months. We show that the US is on a trajectory similar to that of other countries where B.1.1.7 became dominant, requiring immediate and decisive action to minimize COVID-19 morbidity and mortality.
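The logistic growth with roughly weekly doubling described in this abstract can be illustrated with a short Python sketch. It assumes that the odds of the variant (fraction divided by one minus fraction) double every `doubling_days`; the function names, the 1% starting fraction, and the 7-day default are illustrative assumptions, not values or methods taken from the study.

```python
import math


def variant_fraction(t_days, p0, doubling_days=7.0):
    """Variant fraction after t_days, assuming its odds double every doubling_days."""
    rate = math.log(2) / doubling_days           # per-day logistic growth rate
    odds = p0 / (1.0 - p0) * math.exp(rate * t_days)
    return odds / (1.0 + odds)


def days_to_reach(target, p0, doubling_days=7.0):
    """Days needed for the variant fraction to grow from p0 to target."""
    rate = math.log(2) / doubling_days
    odds_ratio = (target / (1.0 - target)) / (p0 / (1.0 - p0))
    return math.log(odds_ratio) / rate


if __name__ == "__main__":
    # Hypothetical starting point: the variant makes up 1% of cases on day 0.
    for week in range(0, 9, 2):
        print(week, round(variant_fraction(7 * week, 0.01), 3))
    print(round(days_to_reach(0.5, 0.01), 1), "days until 50%")
```

Under these assumptions the fraction roughly doubles each week only while it is small; as it approaches 50% the growth in the fraction itself slows, which is the characteristic logistic shape.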
Subject(s)
COVID-19 , Models, Biological , SARS-CoV-2 , COVID-19/genetics , COVID-19/mortality , COVID-19/transmission , Female , Humans , Male , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/pathogenicity , United States/epidemiology
ABSTRACT
The emergence and spread of SARS-CoV-2 lineage B.1.1.7, first detected in the United Kingdom, has become a global public health concern because of its increased transmissibility. Over 2,500 COVID-19 cases associated with this variant have been detected in the United States (US) since December 2020, but the extent of establishment is relatively unknown. Using travel, genomic, and diagnostic data, we highlight that the primary ports of entry for B.1.1.7 in the US were in New York, California, and Florida. Furthermore, we found evidence for many independent B.1.1.7 establishments starting in early December 2020, followed by interstate spread by the end of the month. Finally, we project that B.1.1.7 will be the dominant lineage in many states by mid- to late March. Thus, genomic surveillance for B.1.1.7 and other variants urgently needs to be enhanced to better inform the public health response.
Subject(s)
COVID-19 Testing , COVID-19 , Models, Biological , SARS-CoV-2 , COVID-19/genetics , COVID-19/mortality , COVID-19/transmission , Female , Humans , Male , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/pathogenicity , United States/epidemiology
ABSTRACT
CDC continues to track the evolution of SARS-CoV-2, including the Omicron variant and its descendants, using national genomic surveillance. This report summarizes U.S. trends in variant proportion estimates during May 2023-September 2024, a period when SARS-CoV-2 lineages primarily comprised descendants of Omicron variants XBB and JN.1. During summer and fall 2023, multiple descendants of XBB with immune escape substitutions emerged and reached >10% prevalence, including EG.5-like lineages by June 24, FL.1.5.1-like lineages by August 5, HV.1 lineage by September 30, and HK.3-like lineages by November 11. In winter 2023, the JN.1 variant emerged in the United States and rapidly attained predominance nationwide, representing a substantial genetic shift (>30 spike protein amino acid differences) from XBB lineages. Descendants of JN.1 subsequently circulated and reached >10% prevalence, including KQ.1-like and KP.2-like lineages by April 13, KP.3 and LB.1-like lineages by May 25, and KP.3.1.1 by July 20. Surges in COVID-19 cases occurred in winter 2024 during the shift to JN.1 predominance, as well as in summer 2023 and 2024 during circulation of multiple XBB and JN.1 descendants, respectively. The ongoing evolution of the Omicron variant highlights the importance of continued genomic surveillance to guide medical countermeasure development, including the selection of antigens for updated COVID-19 vaccines.
Subject(s)
COVID-19 , Genome, Viral , Genomics , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , United States/epidemiology , COVID-19/epidemiology
ABSTRACT
We enrolled arriving international air travelers in a severe acute respiratory syndrome coronavirus 2 genomic surveillance program. We used molecular testing of pooled nasal swabs and sequenced positive samples to determine sublineage. Traveler-based surveillance provided early-warning variant detection, reporting the first Omicron BA.2 in the US and the first BA.3 in North America.
Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Airports , COVID-19/diagnosis , Genomics
ABSTRACT
Monitoring emerging SARS-CoV-2 lineages and their epidemiologic characteristics helps to inform public health decisions regarding vaccine policy, the use of therapeutics, and health care capacity. When the SARS-CoV-2 Alpha variant emerged in late 2020, a spike gene (S-gene) deletion (Δ69-70) in the N-terminal region, which might compensate for immune escape mutations that impair infectivity (1), resulted in reduced or failed S-gene target amplification in certain multitarget reverse transcription-polymerase chain reaction (RT-PCR) assays, a pattern referred to as S-gene target failure (SGTF) (2). The predominant U.S. SARS-CoV-2 lineages have generally alternated between SGTF and S-gene target presence (SGTP), which, alongside genomic sequencing, has facilitated early monitoring of emerging variants. During a period when Omicron BA.5-related sublineages (which exhibit SGTF) predominated, an XBB.1.5 sublineage with SGTP has rapidly expanded in the northeastern United States and other regions.
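The SGTF/SGTP distinction described in this abstract can be sketched as a simple classification rule. The Python example below assumes a three-target RT-PCR assay (S, N, and ORF1ab) that reports a cycle threshold (Ct) value per target, or nothing when a target fails to amplify. The function name, the choice of targets, and the Ct cutoff of 30 are illustrative assumptions, not parameters stated in the report.

```python
from typing import Optional

# Assumed cutoff: the non-S targets must amplify at or below this Ct
# for a missing S signal to be interpreted as a true target failure.
CT_CUTOFF = 30.0


def classify_s_gene(ct_s: Optional[float],
                    ct_n: Optional[float],
                    ct_orf1ab: Optional[float],
                    ct_cutoff: float = CT_CUTOFF) -> str:
    """Return 'SGTF', 'SGTP', or 'indeterminate' for one specimen.

    A Ct of None means the target did not amplify.
    """
    def detected(ct: Optional[float]) -> bool:
        return ct is not None and ct <= ct_cutoff

    # Without strong signal on the other targets, a missing S signal
    # could simply reflect low viral load, so no call is made.
    if not (detected(ct_n) and detected(ct_orf1ab)):
        return "indeterminate"
    return "SGTP" if ct_s is not None else "SGTF"


if __name__ == "__main__":
    print(classify_s_gene(None, 22.1, 21.8))   # S missing, others strong
    print(classify_s_gene(23.0, 22.1, 21.8))   # all targets present
    print(classify_s_gene(None, 35.5, 36.0))   # weak specimen
```

Requiring strong amplification of the non-S targets is a design choice meant to avoid counting low-viral-load specimens as SGTF; a real assay's interpretation rules may differ.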
Subject(s)
COVID-19 , Public Health , United States/epidemiology , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Mutation , COVID-19 Testing
ABSTRACT
Early detection of emerging SARS-CoV-2 variants is critical to guiding rapid risk assessments, providing clear and timely communication messages, and coordinating public health action. CDC identifies and monitors novel SARS-CoV-2 variants through diverse surveillance approaches, including genomic, wastewater, traveler-based, and digital public health surveillance (e.g., global data repositories, news, and social media). The SARS-CoV-2 variant BA.2.86 was first sequenced in Israel and reported on August 13, 2023. The first U.S. COVID-19 case caused by this variant was reported on August 17, 2023, after a patient received testing for SARS-CoV-2 at a health care facility on August 3. In the following month, eight additional U.S. states detected BA.2.86 across various surveillance systems, including specimens from health care settings, wastewater surveillance, and traveler-based genomic surveillance. As of October 23, 2023, sequences have been reported from at least 32 countries. Continued variant tracking and further evidence are needed to evaluate the full public health impact of BA.2.86. Timely genomic sequence submissions to global public databases aided early detection of BA.2.86 despite the decline in the number of specimens being sequenced during the past year. This report describes how multicomponent surveillance and genomic sequencing were used in real time to track the emergence and transmission of the BA.2.86 variant. This surveillance approach provides valuable information regarding implementing and sustaining comprehensive surveillance not only for novel SARS-CoV-2 variants but also for future pathogen threats.
Subject(s)
COVID-19 , Humans , SARS-CoV-2/genetics , Wastewater , Wastewater-Based Epidemiological Monitoring
ABSTRACT
CDC has used national genomic surveillance since December 2020 to monitor SARS-CoV-2 variants that have emerged throughout the COVID-19 pandemic, including the Omicron variant. This report summarizes U.S. trends in variant proportions from national genomic surveillance during January 2022-May 2023. During this period, the Omicron variant remained predominant, with various descendant lineages reaching national predominance (>50% prevalence). During the first half of 2022, BA.1.1 reached predominance by the week ending January 8, 2022, followed by BA.2 (March 26), BA.2.12.1 (May 14), and BA.5 (July 2); the predominance of each variant coincided with surges in COVID-19 cases. The latter half of 2022 was characterized by the circulation of sublineages of BA.2, BA.4, and BA.5 (e.g., BQ.1 and BQ.1.1), some of which independently acquired similar spike protein substitutions associated with immune evasion. By the end of January 2023, XBB.1.5 became predominant. As of May 13, 2023, the most common circulating lineages were XBB.1.5 (61.5%), XBB.1.9.1 (10.0%), and XBB.1.16 (9.4%); XBB.1.16 and XBB.1.16.1 (2.4%), containing the K478R substitution, and XBB.2.3 (3.2%), containing the P521S substitution, had the fastest doubling times at that point. Analytic methods for estimating variant proportions have been updated as the availability of sequencing specimens has declined. The continued evolution of Omicron lineages highlights the importance of genomic surveillance to monitor emerging variants and help guide vaccine development and use of therapeutics.
Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics , COVID-19/epidemiology , Genomics
ABSTRACT
Rapid advances in DNA sequencing technology ("next-generation sequencing") have inspired optimism about the potential of human genomics for "precision medicine." Meanwhile, pathogen genomics is already delivering "precision public health" through more effective investigations of outbreaks of foodborne illnesses, better-targeted tuberculosis control, and more timely and granular influenza surveillance to inform the selection of vaccine strains. In this article, we describe how public health agencies have been adopting pathogen genomics to improve their effectiveness in almost all domains of infectious disease. This momentum is likely to continue, given the ongoing development in sequencing and sequencing-related technologies.
Subject(s)
Disease Outbreaks , Foodborne Diseases/epidemiology , Genomics , High-Throughput Nucleotide Sequencing , Influenza, Human/epidemiology , Public Health , Tuberculosis/epidemiology , Animals , Bacteria/genetics , Foodborne Diseases/diagnosis , Foodborne Diseases/microbiology , Foodborne Diseases/parasitology , Humans , Influenza, Human/diagnosis , Influenza, Human/microbiology , Metagenomics , Parasites/genetics , Tuberculosis/diagnosis , Viruses/genetics
ABSTRACT
Genomic surveillance is a critical tool for tracking emerging variants of SARS-CoV-2 (the virus that causes COVID-19), which can exhibit characteristics that potentially affect public health and clinical interventions, including increased transmissibility, illness severity, and capacity for immune escape. During June 2021-January 2022, CDC expanded genomic surveillance data sources to incorporate sequence data from public repositories to produce weighted estimates of variant proportions at the jurisdiction level and refined analytic methods to enhance the timeliness and accuracy of national and regional variant proportion estimates. These changes also allowed for more comprehensive variant proportion estimation at the jurisdictional level (i.e., U.S. state, district, territory, and freely associated state). The data in this report are a summary of findings of recent proportions of circulating variants that are updated weekly on CDC's COVID Data Tracker website to enable timely public health action. The SARS-CoV-2 Delta (B.1.617.2 and AY sublineages) variant rose from 1% to >50% of viral lineages circulating nationally during 8 weeks, from May 1-June 26, 2021. Delta-associated infections remained predominant until being rapidly overtaken by infections associated with the Omicron (B.1.1.529 and BA sublineages) variant in December 2021, when Omicron increased from 1% to >50% of circulating viral lineages during a 2-week period. As of the week ending January 22, 2022, Omicron was estimated to account for 99.2% (95% CI = 99.0%-99.5%) of SARS-CoV-2 infections nationwide, and Delta for 0.7% (95% CI = 0.5%-1.0%). The dynamic landscape of SARS-CoV-2 variants in 2021, including Delta- and Omicron-driven resurgences of SARS-CoV-2 transmission across the United States, underscores the importance of robust genomic surveillance efforts to inform public health planning and practice.
Subject(s)
COVID-19/epidemiology , COVID-19/virology , SARS-CoV-2/genetics , Centers for Disease Control and Prevention, U.S. , Genomics , Humans , Prevalence , Public Health Surveillance/methods , United States/epidemiology
Subject(s)
COVID-19/epidemiology , COVID-19/virology , Evolution, Molecular , Genomics/methods , Genomics/trends , Mutation , SARS-CoV-2/genetics , Animals , Automation/methods , Basic Reproduction Number , COVID-19/immunology , COVID-19/transmission , COVID-19 Vaccines/immunology , Genome, Viral/genetics , Humans , Mink/virology , Pandemics/statistics & numerical data , Phylogeny , Public Health/methods , Public Health/trends , SARS-CoV-2/immunology , SARS-CoV-2/isolation & purification , SARS-CoV-2/pathogenicity , Social Media , Uncertainty
ABSTRACT
Coronavirus disease has disproportionately affected persons in congregate settings and high-density workplaces. To determine more about the transmission patterns of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in these settings, we performed whole-genome sequencing and phylogenetic analysis on 319 (14.4%) samples from 2,222 SARS-CoV-2-positive persons associated with 8 outbreaks in Minnesota, USA, during March-June 2020. Sequencing indicated that virus spread in 3 long-term care facilities and 2 correctional facilities was associated with a single genetic sequence and that in a fourth long-term care facility, outbreak cases were associated with 2 distinct sequences. In contrast, cases associated with outbreaks in 2 meat-processing plants were associated with multiple SARS-CoV-2 sequences. These results suggest that a single introduction of SARS-CoV-2 into a facility can result in a widespread outbreak. Early identification and cohorting (segregating) of virus-positive persons in these settings, along with continued vigilance with infection prevention and control measures, is imperative.
Subject(s)
COVID-19 , SARS-CoV-2 , Disease Outbreaks , Humans , Minnesota/epidemiology , Phylogeny
ABSTRACT
On December 14, 2020, the United Kingdom reported a SARS-CoV-2 variant of concern (VOC), lineage B.1.1.7, also referred to as VOC 202012/01 or 20I/501Y.V1.* The B.1.1.7 variant is estimated to have emerged in September 2020 and has quickly become the dominant circulating SARS-CoV-2 variant in England (1). B.1.1.7 has been detected in over 30 countries, including the United States. As of January 13, 2021, approximately 76 cases of B.1.1.7 have been detected in 12 U.S. states. Multiple lines of evidence indicate that B.1.1.7 is more efficiently transmitted than are other SARS-CoV-2 variants (1-3). The modeled trajectory of this variant in the U.S. exhibits rapid growth in early 2021, becoming the predominant variant in March. Increased SARS-CoV-2 transmission might threaten strained health care resources, require extended and more rigorous implementation of public health strategies (4), and increase the percentage of population immunity required for pandemic control. Taking measures to reduce transmission now can lessen the potential impact of B.1.1.7 and allow critical time to increase vaccination coverage. Collectively, enhanced genomic surveillance combined with continued compliance with effective public health measures, including vaccination, physical distancing, use of masks, hand hygiene, and isolation and quarantine, will be essential to limiting the spread of SARS-CoV-2, the virus that causes coronavirus disease 2019 (COVID-19). Strategic testing of persons without symptoms but at higher risk of infection, such as those exposed to SARS-CoV-2 or who have frequent unavoidable contact with the public, provides another opportunity to limit ongoing spread.
Subject(s)
COVID-19/epidemiology , COVID-19/virology , SARS-CoV-2/genetics , COVID-19/transmission , Genome, Viral , Humans , Mutation , United States/epidemiology
ABSTRACT
SARS-CoV-2, the virus that causes COVID-19, is constantly mutating, leading to new variants (1). Variants have the potential to affect transmission, disease severity, diagnostics, therapeutics, and natural and vaccine-induced immunity. In November 2020, CDC established national surveillance for SARS-CoV-2 variants using genomic sequencing. As of May 6, 2021, sequences from 177,044 SARS-CoV-2-positive specimens collected during December 20, 2020-May 6, 2021, from 55 U.S. jurisdictions had been generated by or reported to CDC. These included 3,275 sequences for the 2-week period ending January 2, 2021, compared with 25,000 sequences for the 2-week period ending April 24, 2021 (0.1% and 3.1% of reported positive SARS-CoV-2 tests, respectively). Because sequences might be generated by multiple laboratories and sequence availability varies both geographically and over time, CDC developed statistical weighting and variance estimation methods to generate population-based estimates of the proportions of identified variants among SARS-CoV-2 infections circulating nationwide and in each of the 10 U.S. Department of Health and Human Services (HHS) geographic regions.* During the 2-week period ending April 24, 2021, the B.1.1.7 and P.1 variants represented an estimated 66.0% and 5.0% of U.S. SARS-CoV-2 infections, respectively, demonstrating the rise to predominance of the B.1.1.7 variant of concern (VOC) and emergence of the P.1 VOC in the United States. Using SARS-CoV-2 genomic surveillance methods to analyze surveillance data produces timely population-based estimates of the proportions of variants circulating nationally and regionally. Surveillance findings demonstrate the potential for new variants to emerge and become predominant, and the importance of robust genomic surveillance. Along with efforts to characterize the clinical and public health impact of SARS-CoV-2 variants, surveillance can help guide interventions to control the COVID-19 pandemic in the United States.
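The idea of weighting sequences to produce population-based variant proportions can be illustrated with a minimal Python sketch. It assumes that each sequenced specimen already carries a survey weight (for example, the number of infections it is taken to represent); how CDC derives those weights and estimates variance is not described in this abstract, so the weights, the function name, and the toy records below are purely hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def weighted_proportions(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Estimate lineage proportions from (lineage, weight) pairs.

    Each proportion is the sum of weights for that lineage divided by
    the sum of all weights.
    """
    totals: Dict[str, float] = defaultdict(float)
    for lineage, weight in records:
        if weight <= 0:
            raise ValueError("weights must be positive")
        totals[lineage] += weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no records supplied")
    return {lineage: w / grand_total for lineage, w in totals.items()}


if __name__ == "__main__":
    # Hypothetical specimens: a sequence from an under-sampled area
    # gets a larger weight than one from a heavily sampled area.
    toy_records = [
        ("B.1.1.7", 3.0),
        ("P.1", 1.0),
        ("B.1.1.7", 1.0),
        ("other", 5.0),
    ]
    for lineage, share in sorted(weighted_proportions(toy_records).items()):
        print(f"{lineage}: {share:.1%}")
```

With equal weights this reduces to a simple count-based proportion; unequal weights are what allow the estimate to correct for uneven sequencing coverage across laboratories, regions, and time.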
Subject(s)
COVID-19/virology , SARS-CoV-2/genetics , COVID-19/epidemiology , Epidemiological Monitoring , Humans , SARS-CoV-2/isolation & purification , United States/epidemiology
ABSTRACT
BACKGROUND: Whole-genome sequencing (WGS) is an emerging and powerful technique by which to perform epidemiological studies in outbreak situations. METHODS: WGS was used to identify and evaluate an outbreak of OXA-232-expressing carbapenem-resistant Klebsiella pneumoniae (CRKP) transmitted to 16 patients over the course of 40 weeks via endoscopic retrograde cholangiopancreatography procedures at a single institution. WGS was performed on 32 OXA-232 CRKP isolates (1-7 per patient) and single-nucleotide variants (SNVs) were analyzed, with reference to the index patient's isolate. RESULTS: Interhost genetic diversity of isolates was between 0 and 15 SNVs during the outbreak; molecular clock calculations estimated 12.31 substitutions per genome per year (95% credibility interval, 7.81-17.05). Both intra- and interpatient diversification at the plasmid and transposon level was observed, significantly impacting the antibiogram of outbreak isolates. The majority of isolates evaluated (n = 27) harbored a blaCTX-M-15 gene, but some (n = 5) lacked the transposon carrying this gene, which resulted in susceptibility to aztreonam and third- and fourth-generation cephalosporins. Similarly, an isolate from a colonized patient lacked the transposon carrying rmtF and aac(6')-Ib genes, resulting in susceptibility to aminoglycosides. CONCLUSIONS: This study broadens the understanding of how bacteria diversify at the genomic level over the course of a defined outbreak and provides reference for future outbreak investigations.
Subject(s)
Carbapenems/pharmacology , Cholangiopancreatography, Endoscopic Retrograde/adverse effects , Klebsiella Infections/epidemiology , Klebsiella Infections/transmission , Klebsiella pneumoniae/drug effects , Klebsiella pneumoniae/genetics , beta-Lactamases/genetics , Cross Infection , Disease Outbreaks , Enzyme Activation , Genetic Variation , Genome, Bacterial , Humans , Klebsiella pneumoniae/classification , Phylogeny , Plasmids/genetics , Whole Genome Sequencing , beta-Lactamases/metabolism
ABSTRACT
Advances in laboratory and information technologies are transforming public health microbiology. High-throughput genome sequencing and bioinformatics are enhancing our ability to investigate and control outbreaks, detect emerging infectious diseases, develop vaccines, and combat antimicrobial resistance, all with increased accuracy, timeliness, and efficiency. The Advanced Molecular Detection (AMD) initiative has allowed the Centers for Disease Control and Prevention (CDC) to provide leadership and coordination in integrating new technologies into routine practice throughout the U.S. public health laboratory system. Collaboration and partnerships are the key to navigating this transition and to leveraging the next generation of methods and tools most effectively for public health.
Subject(s)
Microbiological Techniques/methods , Molecular Diagnostic Techniques/methods , Public Health Administration/methods , Humans , United States
ABSTRACT
Chlamydia psittaci is an obligate intracellular bacterium that can cause significant disease among a broad range of hosts. In humans, this organism may cause psittacosis, a respiratory disease that can spread to involve multiple organs, and in rare untreated cases may be fatal. There are ten known genotypes based on sequencing the major outer-membrane protein gene, ompA, of C. psittaci. Each genotype has overlapping host preferences and virulence characteristics. Recent studies have compared C. psittaci among other members of the Chlamydiaceae family and showed that this species frequently switches hosts and has undergone multiple genomic rearrangements. In this study, we sequenced five genomes of C. psittaci strains representing four genotypes, A, B, D and E. Due to the known association of the type III secretion system (T3SS) and polymorphic outer-membrane proteins (Pmps) with host tropism and virulence potential, we performed a comparative analysis of these elements among these five strains along with a representative genome from each of the remaining six genotypes previously sequenced. We found significant genetic variation in the Pmps and T3SS genes that may partially explain differences noted in C. psittaci host infection and disease.
Subject(s)
Bacterial Outer Membrane Proteins/genetics , Chlamydophila psittaci/genetics , Genetic Variation , Genome, Bacterial , Type III Secretion Systems/genetics , Computational Biology , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Genotype , Molecular Sequence Data , Sequence Analysis, DNA
ABSTRACT
Exserohilum rostratum was the cause of most cases of fungal meningitis and other infections associated with the injection of contaminated methylprednisolone acetate produced by the New England Compounding Center (NECC). Until this outbreak, very few human cases of Exserohilum infection had been reported, and very little was known about this dematiaceous fungus, which usually infects plants. Here, we report using whole-genome sequencing (WGS) for the detection of single nucleotide polymorphisms (SNPs) and phylogenetic analysis to investigate the molecular origin of the outbreak using 22 isolates of E. rostratum retrieved from 19 case patients with meningitis or epidural/spinal abscesses, 6 isolates from contaminated NECC vials, and 7 isolates unrelated to the outbreak. Our analysis indicates that all 28 isolates associated with the outbreak had nearly identical genomes of 33.8 Mb. A total of 8 SNPs were detected among the outbreak genomes, with no more than 2 SNPs separating any 2 of the 28 genomes. The outbreak genomes were separated from the next most closely related control strain by ~136,000 SNPs. We also observed significant genomic variability among strains unrelated to the outbreak, which may suggest the possibility of cryptic speciation in E. rostratum.
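The statement that no more than 2 SNPs separate any 2 outbreak genomes is a claim about the maximum pairwise SNP distance. The Python sketch below shows one simple way such a distance could be computed from pre-aligned sequences of equal length; it is not the pipeline used in the study, and the function names and toy sequences are hypothetical. Positions where either sequence has a non-ACGT character (such as N or a gap) are skipped.

```python
from itertools import combinations
from typing import Dict

VALID_BASES = frozenset("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences carry different bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in VALID_BASES and b in VALID_BASES
    )


def max_pairwise_distance(aligned: Dict[str, str]) -> int:
    """Largest SNP distance between any two sequences (0 if fewer than two)."""
    return max(
        (snp_distance(a, b) for a, b in combinations(aligned.values(), 2)),
        default=0,
    )


if __name__ == "__main__":
    # Toy alignment standing in for three closely related isolates.
    isolates = {
        "isolate_1": "ACGTACGT",
        "isolate_2": "ACGTACGA",
        "isolate_3": "ACTTACGA",
    }
    print(max_pairwise_distance(isolates))
```

A small maximum pairwise distance among outbreak isolates, combined with a very large distance to unrelated strains, is the kind of pattern that supports a common source.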
Subject(s)
Ascomycota/classification , Ascomycota/genetics , Disease Outbreaks , Genome, Fungal , Meningitis, Fungal/epidemiology , Mycoses/epidemiology , Ascomycota/isolation & purification , Cluster Analysis , Humans , Meningitis, Fungal/microbiology , Molecular Epidemiology , Molecular Sequence Data , Molecular Typing , Mycological Typing Techniques , Mycoses/microbiology , New England , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
ABSTRACT
IMPORTANCE: Carbapenem-resistant Enterobacteriaceae (CRE) producing the New Delhi metallo-β-lactamase (NDM) are rare in the United States, but have the potential to add to the increasing CRE burden. Previous NDM-producing CRE clusters have been attributed to person-to-person transmission in health care facilities. OBJECTIVE: To identify a source for, and interrupt transmission of, NDM-producing CRE in a northeastern Illinois hospital. DESIGN, SETTING, AND PARTICIPANTS: Outbreak investigation among 39 case patients at a tertiary care hospital in northeastern Illinois, including a case-control study, infection control assessment, and collection of environmental and device cultures; patient and environmental isolate relatedness was evaluated with pulsed-field gel electrophoresis (PFGE). Following identification of a likely source, targeted patient notification and CRE screening cultures were performed. MAIN OUTCOMES AND MEASURES: Association between exposure and acquisition of NDM-producing CRE; results of environmental cultures and organism typing. RESULTS: In total, 39 case patients were identified from January 2013 through December 2013, 35 with duodenoscope exposure in 1 hospital. No lapses in duodenoscope reprocessing were identified; however, NDM-producing Escherichia coli was recovered from a reprocessed duodenoscope and shared more than 92% similarity to all case patient isolates by PFGE. Based on the case-control study, case patients had significantly higher odds of being exposed to a duodenoscope (odds ratio [OR], 78 [95% CI, 6.0-1008], P < .001). After the hospital changed its reprocessing procedure from automated high-level disinfection with ortho-phthalaldehyde to gas sterilization with ethylene oxide, no additional case patients were identified. CONCLUSIONS AND RELEVANCE: In this investigation, exposure to duodenoscopes with bacterial contamination was associated with apparent transmission of NDM-producing E coli among patients at 1 hospital.
Bacterial contamination of duodenoscopes appeared to persist despite the absence of recognized reprocessing lapses. Facilities should be aware of the potential for transmission of bacteria including antimicrobial-resistant organisms via this route and should conduct regular reviews of their duodenoscope reprocessing procedures to ensure optimal manual cleaning and disinfection.
Subject(s)
Carbapenems/pharmacology , Disinfection/methods , Duodenoscopes/microbiology , Enterobacteriaceae Infections/etiology , Equipment Contamination , Escherichia coli , Adult , Aged , Aged, 80 and over , Case-Control Studies , Cohort Studies , Cross Infection/epidemiology , Disease Outbreaks , Drug Resistance, Bacterial , Enterobacteriaceae Infections/epidemiology , Escherichia coli/enzymology , Escherichia coli/isolation & purification , Female , Hospitals , Humans , Illinois/epidemiology , Male , Middle Aged , beta-Lactamases
ABSTRACT
As public health laboratories expand their genomic sequencing and bioinformatics capacity for the surveillance of different pathogens, labs must carry out robust validation, training, and optimization of wet- and dry-lab procedures. Achieving these goals for algorithms, pipelines and instruments often requires that lower quality datasets be made available for analysis and comparison alongside those of higher quality. This range of data quality in reference sets can complicate the sharing of sub-optimal datasets that are vital for the community and for the reproducibility of assays. Sharing of useful, but sub-optimal datasets requires careful annotation and documentation of known issues to enable appropriate interpretation, avoid being mistaken for better quality information, and for these data (and their derivatives) to be easily identifiable in repositories. Unfortunately, there are currently no standardized attributes or mechanisms for tagging poor-quality datasets, or datasets generated for a specific purpose, to maximize their utility, searchability, accessibility and reuse. The Public Health Alliance for Genomic Epidemiology (PHA4GE) is an international community of scientists from public health, industry and academia focused on improving the reproducibility, interoperability, portability, and openness of public health bioinformatic software, skills, tools and data. To address the challenges of sharing lower quality datasets, PHA4GE has developed a set of standardized contextual data tags, namely fields and terms, that can be included in public repository submissions as a means of flagging pathogen sequence data with known quality issues, increasing their discoverability. 
The contextual data tags were developed through consultations with the community including input from the International Nucleotide Sequence Data Collaboration (INSDC), and have been standardized using ontologies - community-based resources for defining the tag properties and the relationships between them. The standardized tags are agnostic to the organism and the sequencing technique used and thus can be applied to data generated from any pathogen using an array of sequencing techniques. The tags can also be applied to synthetic (lab created) data. The list of standardized tags is maintained by PHA4GE and can be found at https://github.com/pha4ge/contextual_data_QC_tags. Definitions, ontology IDs, examples of use, as well as a JSON representation, are provided. The PHA4GE QC tags were tested, and are now implemented, by the FDA's GenomeTrakr laboratory network as part of its routine submission process for SARS-CoV-2 wastewater surveillance. We hope that these simple, standardized tags will help improve communication regarding quality control in public repositories, in addition to making datasets of variable quality more easily identifiable. Suggestions for additional tags can be submitted to PHA4GE via the New Term Request Form in the GitHub repository. By providing a mechanism for feedback and suggestions, we also expect that the tags will evolve with the needs of the community.