ABSTRACT
Genomic characterization of an Escherichia coli O157:H7 strain linked to leafy greens-associated outbreaks dates its emergence to late 2015. One clade has notable accessory genomic content and a previously described mutation putatively associated with increased arsenic tolerance. This strain is a reoccurring, emerging, or persistent strain causing illness over an extended period.
Subjects
Escherichia coli O157 , Escherichia coli O157/genetics , Disease Outbreaks , Genomics , Mutation
ABSTRACT
In May 2018, PulseNet, the national molecular subtyping network for enteric pathogens, detected a multistate cluster of illnesses caused by an uncommon molecular subtype of Salmonella serovar Mbandaka. A case was defined as an illness in a person infected with the outbreak strain of Salmonella Mbandaka with illness onset on or after 3 March 2018 and before 1 September 2018. One hundred thirty-six cases from 36 states were identified; 35 hospitalisations and no deaths were reported. Ill people ranged in age from <1 year to 95 years (median: 57 years). When standardised questionnaires did not generate a strong hypothesis, open-ended interviews were performed. Sixty-three of 84 (75%) ill people interviewed ultimately reported consuming or possibly consuming a specific sweetened puffed wheat cereal in the week before illness onset. Environmental sampling performed at the cereal manufacturing facility yielded the outbreak strain. The outbreak strain was also isolated from open cereal samples from ill people's homes and from a sealed retail sample. Because of these findings, the brand owner of the product issued a voluntary recall of the cereal on 14 June 2018. Additional investigation of the manufacturing facility identified persistent environmental contamination with Salmonella Mbandaka that was closely genetically related to other isolates in the outbreak. This investigation highlights the ability of Salmonella to survive in low-moisture environments, and the potential for prolonged outbreaks linked to products with long shelf lives and large distribution areas.
Subjects
Salmonella Food Poisoning , Salmonella Infections , Disease Outbreaks , Edible Grain , Humans , Infant , Salmonella/genetics , Salmonella Food Poisoning/epidemiology , Salmonella Infections/epidemiology , Triticum , United States/epidemiology
ABSTRACT
In 2016, the proportion of Neisseria gonorrhoeae isolates with reduced susceptibility to azithromycin rose to 3.6%. A phylogenetic analysis of 334 N. gonorrhoeae isolates collected in 2016 revealed a single, geographically diverse lineage of isolates with MICs of 2 to 16 µg/ml that carried a mosaic-like mtr locus, whereas the majority of isolates with MICs of ≥16 µg/ml appeared sporadically and carried 23S rRNA mutations. Continued molecular surveillance of N. gonorrhoeae isolates will identify new resistance mechanisms.
Subjects
Anti-Bacterial Agents/pharmacology , Azithromycin/pharmacology , Drug Resistance, Bacterial/genetics , Neisseria gonorrhoeae/drug effects , Neisseria gonorrhoeae/genetics , Sentinel Surveillance , Alleles , Genetic Loci/genetics , Gonorrhea/epidemiology , Gonorrhea/microbiology , Humans , Microbial Sensitivity Tests , Molecular Epidemiology , RNA, Ribosomal, 23S/genetics , United States/epidemiology
ABSTRACT
BACKGROUND: Given the lack of new antimicrobials or a vaccine, understanding the evolutionary dynamics of Neisseria gonorrhoeae is a significant public and global health priority. We investigated the emergence and spread of gonococcal strains with decreased susceptibility to cephalosporins and azithromycin using detailed genomic analyses of gonococcal isolates collected in the United States, 2014-2016. METHODS: We sequenced genomes of 649 isolates collected through the Gonococcal Isolate Surveillance Project. We examined the genetic relatedness of isolates and assessed associations between clades and various genotypic and phenotypic combinations. RESULTS: We identified a large and clonal lineage of strains (MLST ST9363) associated with elevated azithromycin minimum inhibitory concentration (AZIem), characterized by a mosaic mtr locus (C substitution in the mtrR promoter, mosaic mtrR and mtrD). Mutations in 23S rRNA were sporadically distributed among AZIem strains. Another clonal group (MLST ST1901) possessed 7 unique PBP2 patterns, and it shared common mutations in other genes associated with cephalosporin resistance. CONCLUSIONS: Whole-genome sequencing methods can enhance monitoring of antimicrobial resistant gonococcal strains by identifying gonococcal populations containing mutations of concern. These methods could inform the development of point-of-care diagnostic tests designed to determine the specific antibiotic susceptibility profile of a gonococcal infection in a patient.
Subjects
Anti-Bacterial Agents/therapeutic use , Azithromycin/therapeutic use , Cephalosporins/therapeutic use , Gonorrhea/drug therapy , Neisseria gonorrhoeae/drug effects , Bacterial Proteins/genetics , Drug Resistance, Bacterial/drug effects , Evolution, Molecular , Genomics , Genotype , Gonorrhea/microbiology , Humans , Male , Microbial Sensitivity Tests/methods , Mutation/drug effects , Mutation/genetics , Neisseria gonorrhoeae/genetics , Phenotype , Promoter Regions, Genetic/drug effects , Promoter Regions, Genetic/genetics , RNA, Ribosomal, 23S/genetics , United States , Whole Genome Sequencing/methods
ABSTRACT
Campylobacter is a leading cause of bacterial foodborne and zoonotic illnesses in the USA. Pulsed-field gel electrophoresis (PFGE) and 7-gene multilocus sequence typing (MLST) have historically been used to differentiate sporadic from outbreak Campylobacter isolates. Whole genome sequencing (WGS) has been shown to provide superior resolution and concordance with epidemiological data when compared with PFGE and 7-gene MLST during outbreak investigations. In this study, we evaluated epidemiological concordance for high-quality SNP (hqSNP), core genome (cg)MLST and whole genome (wg)MLST to cluster or differentiate outbreak-associated and sporadic Campylobacter jejuni and Campylobacter coli isolates. Phylogenetic hqSNP, cgMLST and wgMLST analyses were also compared using Baker's gamma index (BGI) and cophenetic correlation coefficients. Pairwise distances from all three analysis methods were compared using linear regression models. Our results showed that 68/73 sporadic C. jejuni and C. coli isolates were differentiated from outbreak-associated isolates using all three methods. There was a high correlation between cgMLST and wgMLST analyses of the isolates; the BGI, cophenetic correlation coefficient, linear regression model R² and Pearson correlation coefficients were >0.90. The correlation was sometimes lower when comparing hqSNP analysis with the MLST-based methods; the linear regression model R² and Pearson correlation coefficients were between 0.60 and 0.86, and the BGI and cophenetic correlation coefficient were between 0.63 and 0.86 for some outbreak isolates. We demonstrated that C. jejuni and C. coli isolates clustered in concordance with epidemiological data using WGS-based analysis methods. Discrepancies between allele-based and SNP-based approaches may reflect differences in how the two approaches capture genomic variation (SNPs and indels). Since cgMLST examines allele differences in genes that are common to most isolates being compared, it is well suited to surveillance: searching large genomic databases for similar isolates is easily and efficiently done using allelic profiles. On the other hand, an hqSNP approach is much more computationally intensive and does not scale to large sets of genomes. If further resolution between potential outbreak isolates is needed, wgMLST or hqSNP analysis can be used.
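The point about allelic profiles being easy to search can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the isolate names, locus names, allele numbers and the 2-allele threshold are all hypothetical, and missing allele calls (represented here as `None`) are simply skipped, which is one common convention among several.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b):
    """Count loci where both isolates have a called allele and the calls differ."""
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def pairwise_distances(profiles):
    """Return allele distances for every unordered pair of isolates."""
    return {
        (a, b): allele_distance(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


def isolates_within(query_id, profiles, max_alleles):
    """List isolates whose profile is within max_alleles of the query isolate."""
    query = profiles[query_id]
    return sorted(
        other_id
        for other_id, profile in profiles.items()
        if other_id != query_id and allele_distance(query, profile) <= max_alleles
    )


# Hypothetical four-locus profiles; real cgMLST schemes use far more loci.
profiles = {
    "isolate_1": {"locus_1": 4, "locus_2": 7, "locus_3": 1, "locus_4": 2},
    "isolate_2": {"locus_1": 4, "locus_2": 7, "locus_3": 1, "locus_4": 3},
    "isolate_3": {"locus_1": 9, "locus_2": 5, "locus_3": None, "locus_4": 8},
}

distances = pairwise_distances(profiles)
close_matches = isolates_within("isolate_1", profiles, max_alleles=2)
```

Because each comparison is a walk over integer allele numbers rather than a realignment of reads, a database search of this kind stays cheap as the number of genomes grows, which is the scalability contrast with hqSNP analysis drawn in the abstract.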
Subjects
Campylobacter coli , Campylobacter jejuni , United States/epidemiology , Multilocus Sequence Typing , Campylobacter coli/genetics , Phylogeny , Disease Outbreaks
ABSTRACT
Salmonella enterica is a leading cause of bacterial foodborne and zoonotic illnesses in the United States. For this study, we applied four different whole genome sequencing (WGS)-based subtyping methods to a dataset of isolate sequences from nine well-characterized Salmonella outbreaks: high quality single-nucleotide polymorphism (hqSNP) analysis, whole genome multilocus sequence typing using either all loci [wgMLST (all loci)] or only chromosome-associated loci [wgMLST (chrom)], and core genome multilocus sequence typing (cgMLST). For each outbreak, we evaluated the genomic and epidemiologic concordance between hqSNP and allele-based methods. We first compared pairwise genomic differences using all four methods. We observed discrepancies in allele difference ranges when using wgMLST (all loci), likely caused by inflated genetic variation due to loci found on plasmids and/or other mobile genetic elements in the accessory genome. Therefore, we excluded wgMLST (all loci) results from any further comparisons in the study. Then, we created linear regression models and phylogenetic tanglegrams using the remaining three methods. K-means analysis using the silhouette method was applied to compare the ability of the three methods to partition outbreak and sporadic isolate sequences. Our results showed that pairwise hqSNP differences had high concordance with cgMLST and wgMLST (chrom) allele differences. The slopes of the regressions for hqSNP vs. allele pairwise differences were 0.58 (cgMLST) and 0.74 [wgMLST (chrom)], and the slope of the regression was 0.77 for cgMLST vs. wgMLST (chrom) pairwise differences. Tanglegrams showed high clustering concordance between methods using two statistical measures, the Baker's gamma index (BGI) and cophenetic correlation coefficient (CCC), where 9/9 (100%) of outbreaks yielded BGI values ≥ 0.60 and CCCs were ≥ 0.97 across all nine outbreaks and all three methods. 
K-means analysis showed separation of outbreak and sporadic isolate groups with average silhouette widths ≥ 0.87 for outbreak groups and ≥ 0.16 for sporadic groups. This study demonstrates that Salmonella isolates clustered in concordance with epidemiologic data using three WGS-based subtyping methods and supports using cgMLST as the primary method for national surveillance of Salmonella outbreak clusters.
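For readers unfamiliar with silhouette widths, the following Python sketch shows how they can be computed from a pairwise distance matrix once isolates have been assigned to groups. It does not reproduce the study's k-means step: the group labels are supplied directly, the distance values are hypothetical, and assigning a width of 0 to single-member groups is an assumed convention.

```python
def silhouette_widths(dist, labels):
    """Per-isolate silhouette widths from a square distance matrix and group labels.

    Requires at least two distinct labels.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            widths.append(0.0)  # assumed convention for single-member groups
            continue
        # a: mean distance to the other members of the isolate's own group
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other group
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        widths.append(0.0 if denom == 0 else (b - a) / denom)
    return widths


def mean_width_by_group(widths, labels):
    """Average silhouette width for each group label."""
    totals = {}
    for width, label in zip(widths, labels):
        totals.setdefault(label, []).append(width)
    return {label: sum(vals) / len(vals) for label, vals in totals.items()}


# Hypothetical pairwise differences: two tightly related outbreak isolates
# and two sporadic isolates that are distant from everything else.
labels = ["outbreak", "outbreak", "sporadic", "sporadic"]
dist = [
    [0, 2, 50, 60],
    [2, 0, 52, 58],
    [50, 52, 0, 40],
    [60, 58, 40, 0],
]

widths = silhouette_widths(dist, labels)
group_means = mean_width_by_group(widths, labels)
```

In this toy matrix the outbreak group scores close to 1 because its members are far nearer to each other than to anything else, while the sporadic group scores much lower because its members are nearly as far from each other as from the outbreak isolates. That is the same qualitative pattern as the higher outbreak-group and lower sporadic-group widths reported in the abstract.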
ABSTRACT
The COVID-19 pandemic had disproportionate effects on the Veteran population due to the increased prevalence of medical and environmental risk factors. Synthetic electronic health record (EHR) data can help meet the acute need for Veteran population-specific predictive modeling efforts by avoiding the strict barriers to access currently present within Veterans Health Administration (VHA) datasets. The U.S. Food and Drug Administration (FDA) and the VHA launched the precisionFDA COVID-19 Risk Factor Modeling Challenge to develop COVID-19 diagnostic and prognostic models; identify Veteran population-specific risk factors; and test the usefulness of synthetic data as a substitute for real data. The use of synthetic data boosted challenge participation by providing a dataset that was accessible to all competitors. Models trained on synthetic data showed performance metrics that were similar to, but systematically higher than, those of models trained on real data. The important risk factors identified in the synthetic data largely overlapped with those identified from the real data, and both sets of risk factors were validated in the literature. Tradeoffs exist between synthetic data generation approaches based on whether a real EHR dataset is required as input. Synthetic data generated directly from real EHR input will more closely align with the characteristics of the relevant cohort. This work shows that synthetic EHR data will have practical value to the Veterans' health research community for the foreseeable future.
Subjects
ATP-Binding Cassette Transporters/genetics , Anti-Bacterial Agents/pharmacology , Bacterial Proteins/genetics , Drug Resistance, Multiple, Bacterial/genetics , Neisseria gonorrhoeae/drug effects , Neisseria gonorrhoeae/genetics , Genome, Bacterial/genetics , Humans , Microbial Sensitivity Tests , Neisseria gonorrhoeae/isolation & purification , Whole Genome Sequencing
ABSTRACT
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), has spread globally and is being surveilled with an international genome sequencing effort. Surveillance consists of sample acquisition, library preparation, and whole genome sequencing. This has necessitated a classification scheme detailing Variants of Concern (VOC) and Variants of Interest (VOI), and the rapid expansion of bioinformatics tools for sequence analysis. These bioinformatic tools are means for major actionable results: maintaining quality assurance and checks, defining population structure, performing genomic epidemiology, and inferring lineage to allow reliable and actionable identification and classification. Additionally, the pandemic has required public health laboratories to reach high throughput proficiency in sequencing library preparation and downstream data analysis rapidly. However, both processes can be limited by a lack of a standardized sequence dataset. Methods: We identified six SARS-CoV-2 sequence datasets from recent publications, public databases and internal resources. In addition, we created a method to mine public databases to identify representative genomes for these datasets. Using this novel method, we identified several genomes as either VOI/VOC representatives or non-VOI/VOC representatives. To describe each dataset, we utilized a previously published datasets format, which describes accession information and whole dataset information. Additionally, a script from the same publication has been enhanced to download and verify all data from this study. Results: The benchmark datasets focus on the two most widely used sequencing platforms: long read sequencing data from the Oxford Nanopore Technologies platform and short read sequencing data from the Illumina platform. 
There are six datasets: three were derived from recent publications; two were derived from data mining public databases to answer common questions not covered by published datasets; one unique dataset representing common sequence failures was obtained by rigorously scrutinizing data that did not pass quality checks. The dataset summary table, data mining script and quality control (QC) values for all sequence data are publicly available on GitHub: https://github.com/CDCgov/datasets-sars-cov-2. Discussion: The datasets presented here were generated to help public health laboratories build sequencing and bioinformatics capacity, benchmark different workflows and pipelines, and calibrate QC thresholds to ensure sequencing quality. Together, improvements in these areas support accurate and timely outbreak investigation and surveillance, providing actionable data for pandemic management. Furthermore, these publicly available and standardized benchmark data will facilitate the development and adjudication of new pipelines.
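The abstract mentions a script that downloads and verifies the data but does not say how verification is done. One common approach, assumed here purely for illustration, is to compare a SHA-256 checksum of each downloaded file against an expected value. The Python sketch below uses only the standard library; the function names, file name and demo payload are hypothetical and are not taken from the published script.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Return True when the file's checksum matches the expected value."""
    return sha256_of_file(path) == expected_sha256.strip().lower()


# Demonstration on a temporary file standing in for a downloaded sequence.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.fasta")  # hypothetical file name
    payload = b">demo\nACGT\n"
    with open(demo_path, "wb") as handle:
        handle.write(payload)

    expected = hashlib.sha256(payload).hexdigest()
    matches = verify_download(demo_path, expected)
    mismatch = verify_download(demo_path, "0" * 64)
```

Reading in chunks keeps memory use flat for multi-gigabyte read files, which matters for the sequencing data volumes described here.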
Subjects
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Benchmarking , Computational Biology , Sequence Analysis
ABSTRACT
Laboratories that run Whole Genome Sequencing (WGS) produce a tremendous amount of data, up to 10 gigabytes for some common instruments. There is a need to standardize the quality assurance and quality control process (QA/QC). Therefore we have created SneakerNet to automate the QA/QC for WGS.
ABSTRACT
The objective of this study was to determine sources of Shiga toxin-producing Escherichia coli O157 (STEC O157) infection among visitors to Farm X and develop public health recommendations. A case-control study was conducted. Case-patients were defined as the first ill child (aged <18 years) in the household with laboratory-confirmed STEC O157, or physician-diagnosed hemolytic uremic syndrome with laboratory confirmation by serology, who visited Farm X in the 10 days prior to illness. Controls were selected from Farm X visitors aged <18 years who had no symptoms during the same time period as case-patients. Environmental and animal fecal samples collected from Farm X were cultured; isolates from Farm X were compared with patient isolates using whole genome sequencing (WGS). Case-patients were more likely than controls to have sat on hay bales at the doe barn (adjusted odds ratio: 4.55; 95% confidence interval: 1.41-16.13). No handwashing stations were available; limited hand sanitizer was provided. Overall, 37% (29 of 78) of animal and environmental samples collected were positive for STEC; of these, 62% (18 of 29) yielded STEC O157 highly related by WGS to patient isolates. STEC O157 environmental contamination and fecal shedding by goats at Farm X were extensive. Farms should provide handwashing stations with soap, running water, and disposable towels. Access to animal areas, including animal pens and enclosures, should be limited for young children who are at risk for severe outcomes from STEC O157 infection. National recommendations should be adopted to reduce disease transmission.