ABSTRACT
BACKGROUND: The origin of novel SARS-CoV-2 spike sequences found in wastewater, without corresponding detection in clinical specimens, remains unclear. We sought to determine the origin of one such cryptic wastewater lineage by tracking and characterising its persistence and genomic evolution over time.
METHODS: We first detected a cryptic lineage, WI-CL-001, in municipal wastewater in Wisconsin, USA, in January, 2022. To determine the source of WI-CL-001, we systematically sampled wastewater from targeted sub-sewershed lines and maintenance holes using compositing autosamplers. Viral concentrations in wastewater samples over time were measured by RT digital PCR. We used metagenomic 12S rRNA sequencing to determine the virus's host species, and we sequenced SARS-CoV-2 spike receptor binding domains and, where possible, whole viral genomes to identify and characterise the evolution of this lineage.
FINDINGS: We traced WI-CL-001 to its source at a single commercial building. There we detected the cryptic lineage at concentrations as high as 2·7 × 10⁹ genome copies per L. The majority of 12S rRNA sequences detected in wastewater leaving the identified source building were human. Additionally, we generated over 100 viral receptor binding domain and whole-genome sequences from wastewater samples containing the cryptic lineage collected over the 13 consecutive months this virus was detectable (January, 2022, to January, 2023). These sequences contained a combination of fixed nucleotide substitutions characteristic of Pango lineage B.1.234, which circulated in humans in Wisconsin at low levels from October, 2020, to February, 2021. Despite this, mutations in the spike gene and elsewhere resembled those subsequently found in omicron variants.
INTERPRETATION: We propose that prolonged detection of WI-CL-001 in wastewater indicates persistent shedding of SARS-CoV-2 from a single human initially infected by an ancestral B.1.234 virus. The accumulation of convergent omicron-like mutations in WI-CL-001's ancestral B.1.234 genome probably reflects persistent infection and extensive within-host evolution. People who shed cryptic lineages could be an important source of highly divergent viruses that sporadically emerge and spread.
FUNDING: The Rockefeller Foundation, Wisconsin Department of Health Services, Centers for Disease Control and Prevention, National Institute on Drug Abuse, and the Center for Research on Influenza Pathogenesis and Transmission.