Results 1 - 20 of 83
1.
Emerg Infect Dis ; 30(4): 701-710, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38526070

ABSTRACT

Salmonella enterica serovar Infantis presents an ever-increasing threat to public health because of its spread throughout many countries and association with high levels of antimicrobial resistance (AMR). We analyzed whole-genome sequences of 5,284 Salmonella Infantis strains from 74 countries, isolated during 1989-2020 from a wide variety of human, animal, and food sources, to compare genetic phylogeny, AMR determinants, and plasmid presence. The global Salmonella Infantis population structure diverged into 3 clusters: a North American cluster, a European cluster, and a global cluster. The levels of AMR varied by Salmonella Infantis cluster and by isolation source; 73% of poultry isolates were multidrug resistant, compared with 35% of human isolates. This finding correlated with the presence of the pESI megaplasmid; 71% of poultry isolates contained pESI, compared with 32% of human isolates. This study provides key information for public health teams engaged in reducing the spread of this pathogen.


Subjects
One Health , Salmonella enterica , Animals , Humans , Serogroup , Anti-Bacterial Agents/pharmacology , Salmonella/genetics , Poultry , Drug Resistance, Multiple, Bacterial/genetics
2.
Clin Infect Dis ; 73(8): 1537-1539, 2021 10 20.
Article in English | MEDLINE | ID: mdl-34240118

ABSTRACT

Open-source DNA sequence databases have long been touted as beneficial to public health, including the facilitation of earlier detection and response to infectious disease outbreaks. Of critical importance to harnessing these benefits is the metadata that describe general and other domain-specific attributes (eg, collection location, isolate type) of a sample. Unlike the sequence data, metadata are often incomplete and lack adherence to an international standard. Here, we describe the problem posed by such variable and incomplete metadata in terms of interpretative labor costs (the time and energy necessary to make sense of the signal in the genetic data) and the impact such metadata have on foodborne outbreak detection and response. Improving the quality of sequence-associated metadata would allow for earlier detection of emerging food safety hazards and allow faster response to foodborne outbreaks.


Subjects
Foodborne Diseases , Metadata , Disease Outbreaks , Food Safety , Foodborne Diseases/epidemiology , Humans , Public Health , Public Health Surveillance
3.
Appl Environ Microbiol ; 87(3)2021 01 15.
Article in English | MEDLINE | ID: mdl-33187991

ABSTRACT

Vibrio parahaemolyticus is the most common cause of seafood-borne illness reported in the United States. The draft genomes of 132 North American clinical and oyster V. parahaemolyticus isolates were sequenced to investigate their phylogenetic and biogeographic relationships. The majority of oyster isolate sequence types (STs) were from a single harvest location; however, four were identified from multiple locations. There was population structure along the Gulf and Atlantic Coasts of North America, with what seemed to be a hub of genetic variability along the Gulf Coast, with some of the same STs occurring along the Atlantic Coast and one shared between the coastal waters of the Gulf and those of Washington State. Phylogenetic analyses found nine well-supported clades. Two clades were composed of isolates from both clinical and oyster sources. Four were composed of isolates entirely from clinical sources, and three were entirely from oyster sources. Each single-source clade consisted of one ST. Some human isolates lack tdh, trh, and some type III secretion system (T3SS) genes, which are established virulence genes of V. parahaemolyticus. Thus, these genes are not essential for pathogenicity. However, isolates in the monophyletic groups from clinical sources were enriched in several categories of genes compared to those from monophyletic groups of oyster isolates. These functional categories include cell signaling, transport, and metabolism. The identification of genes in these functional categories provides a basis for future in-depth pathogenicity investigations of V. parahaemolyticus. IMPORTANCE: Vibrio parahaemolyticus is the most common cause of seafood-borne illness reported in the United States and is frequently associated with shellfish consumption. This study contributes to our knowledge of the biogeography and functional genomics of this species around North America. STs shared between the Gulf Coast and the Atlantic seaboard as well as Pacific waters suggest possible transport via oceanic currents or large shipping vessels. STs frequently isolated from humans but rarely, if ever, isolated from the environment are likely more competitive in the human gut than other STs. This could be due to additional functional capabilities in areas such as cell signaling, transport, and metabolism, which may give these isolates an advantage in novel nutrient-replete environments such as the human gut.


Subjects
Vibrio parahaemolyticus/genetics , Animals , Biological Monitoring , Genes, Bacterial , Genome, Bacterial , Humans , North America , Ostreidae/microbiology , Phylogeny , Vibrio Infections/microbiology , Vibrio parahaemolyticus/isolation & purification , Virulence/genetics , Whole Genome Sequencing
4.
Plant Dis ; 105(11): 3554-3563, 2021 Nov.
Article in English | MEDLINE | ID: mdl-33599513

ABSTRACT

Fire blight, caused by the bacterium Erwinia amylovora, is one of the most important diseases of apple. The antibiotic streptomycin is routinely used in the commercial apple industries of New York (NY) and New England to manage the disease. In 2002 and again from 2011 to 2014, outbreaks of streptomycin resistance (SmR) were reported and investigated in NY. Motivated by new grower reports of control failures, we conducted a follow-up investigation of the distribution of SmR and E. amylovora strains for major apple production regions of NY over the last 6 years (2015 to 2020). Characterization of clustered regularly interspaced short palindromic repeat (CRISPR) profiles revealed that a few "cosmopolitan" strains were widely prevalent across regions, whereas many other "resident" strains were confined to one location. In addition, we uncovered novel CRISPR profile diversity in all investigated regions. SmR E. amylovora was detected only in a small area spanning two counties from 2017 to 2020 and was always associated with one CRISPR profile (41:23:38), which matched the profile of the SmR E. amylovora discovered in 2002. This suggests the original SmR E. amylovora was never fully eradicated and went undetected because of several seasons of low disease pressure in this region. Investigation of several representative isolates under controlled greenhouse conditions indicated significant differences in aggressiveness on 'Gala' apples. Potential implications of strain differences include the propensity of strains to become distributed across wide geographic regions and associated resistance management practices. Results from this work will directly influence sustainable fire blight management recommendations for commercial apple industries in NY state and other regions.


Subjects
Erwinia amylovora , Malus , Clustered Regularly Interspaced Short Palindromic Repeats , Erwinia amylovora/genetics , Follow-Up Studies , Malus/genetics , New York , Plant Diseases , Streptomycin/pharmacology
5.
J Clin Microbiol ; 57(5)2019 05.
Article in English | MEDLINE | ID: mdl-30728194

ABSTRACT

Foodborne pathogen surveillance in the United States is transitioning from strain identification using restriction digest technology (pulsed-field gel electrophoresis [PFGE]) to shotgun sequencing of the entire genome (whole-genome sequencing [WGS]). WGS requires a new suite of analysis tools, some of which have long histories in academia but are new to the field of public health and regulatory decision making. Although the general workflow is fairly standard for collecting and analyzing WGS data for disease surveillance, there are a number of differences in how the data are collected and analyzed across public health agencies, both nationally and internationally. This impedes collaborative public health efforts, so national and international efforts are underway to enable direct comparison of these different analysis methods. Ultimately, the harmonization efforts will allow the (mutually trusted and understood) production and analysis of WGS data by labs and agencies worldwide, thus improving outbreak response capabilities globally. This review provides a historical perspective on the use of WGS for pathogen tracking and summarizes the efforts underway to ensure the major steps in phylogenomic pipelines used for pathogen disease surveillance can be readily validated. The tools for doing this will ensure that the results produced are sound, reproducible, and comparable across different analytic approaches.


Subjects
Bacteria/genetics , Data Analysis , Foodborne Diseases/diagnosis , Phylogeny , Bacteria/pathogenicity , Computational Biology/methods , Computational Biology/standards , Disease Outbreaks/prevention & control , Electrophoresis, Gel, Pulsed-Field , Epidemiological Monitoring , Genome, Bacterial , Humans , Public Health , United States , Whole Genome Sequencing
6.
Food Microbiol ; 70: 113-119, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29173617

ABSTRACT

Globally, unpasteurized milk products are vehicles for the transmission of brucellosis, a zoonosis responsible for cases of foodborne illness in the United States and elsewhere. Existing PCR assays to detect Brucella species are restricted by the resolution of band sizes on a gel or the number of fluorescent channels in a single real-time system. The Luminex bead-based suspension array is performed in a 96-well plate allowing for high throughput screening of up to 100 targets in one sample with easily discernible results. We have developed an array using the Bio-Plex 200 to differentiate the most common Brucella species: B. abortus, B. melitensis, B. suis, B. suis bv5, B. canis, B. ovis, B. pinnipedia, and B. neotomae, as well as Brucella genus. All probes showed high specificity, with no cross-reaction with non-Brucella strains. We could detect pure DNA from B. abortus, B. melitensis, and genus-level Brucella at concentrations of ≤5 fg/µL. Pure DNA from all other species tested positive at concentrations well below 500 fg/µL and we positively identified B. neotomae in six artificially contaminated cheese and milk products. An intra-laboratory verification further demonstrated the assay's accuracy and robustness in the rapid screening (3-4 h including PCR) of DNA.


Subjects
Bacterial Typing Techniques/methods , Brucella/isolation & purification , Brucellosis/microbiology , Oligonucleotide Array Sequence Analysis/methods , Animals , Bacterial Typing Techniques/instrumentation , Brucella/classification , Brucella/genetics , Brucellosis/transmission , DNA, Bacterial/genetics , Humans , Milk/microbiology , Oligonucleotide Array Sequence Analysis/instrumentation , Sensitivity and Specificity , Sheep
7.
BMC Bioinformatics ; 18(1): 178, 2017 Mar 20.
Article in English | MEDLINE | ID: mdl-28320310

ABSTRACT

BACKGROUND: Using phylogenomic analysis tools for tracking pathogens has become standard practice in academia, public health agencies, and large industries. Using the same raw read genomic data as input, there are several different approaches being used to infer a phylogenetic tree. These include many different SNP pipelines, wgMLST approaches, k-mer algorithms, whole genome alignment and others; each of these has advantages and disadvantages: some have been extensively validated, some are faster, some have higher resolution. A few of these analysis approaches are well-integrated into the regulatory process of US Federal agencies (e.g. the FDA's SNP pipeline for tracking foodborne pathogens). However, despite extensive validation on benchmark datasets and comparison with other pipelines, we lack methods for fully exploring the effects of multiple parameter values in each pipeline that can potentially have an effect on whether the correct phylogenetic tree is recovered. RESULTS: To resolve this problem, we offer a program, TreeToReads, which can generate raw read data from mutated genomes simulated under a known phylogeny. This simulation pipeline allows direct comparisons of simulated and observed data in a controlled environment. At each step of these simulations, researchers can vary parameters of interest (e.g., input tree topology, amount of sequence divergence, rate of indels, read coverage, distance of reference genome, etc.) to assess the effects of various parameter values on correctly calling SNPs and reconstructing an accurate tree. CONCLUSIONS: Such critical assessments of the accuracy and robustness of analytical pipelines are essential to progress in both research and applied settings.
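
The core idea in the abstract above, simulating data under a known phylogeny so that a pipeline's output can be checked against the truth, can be illustrated with a toy sketch. This is not TreeToReads' interface or its substitution model, and it stops before read simulation. The tree, branch lengths, genome length, and the rule that every site mutates at most once are all assumptions made for illustration.

```python
import random

BASES = "ACGT"

# Hypothetical tree: (label, SNPs on the branch leading to this node, children).
TREE = ("root", 0, [
    ("AB", 5, [("A", 2, []), ("B", 3, [])]),
    ("C", 8, []),
])


def evolve(node, parent_seq, free_sites, rng, tips):
    """Copy the parent sequence, add this branch's SNPs, and recurse."""
    label, n_snps, children = node
    seq = list(parent_seq)
    for _ in range(n_snps):
        # Assumption: each site mutates at most once in the whole tree.
        site = free_sites.pop()
        seq[site] = rng.choice([b for b in BASES if b != seq[site]])
    if not children:
        tips[label] = "".join(seq)
    for child in children:
        evolve(child, seq, free_sites, rng, tips)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(1)
genome_length = 500
root_seq = [rng.choice(BASES) for _ in range(genome_length)]
free_sites = list(range(genome_length))
rng.shuffle(free_sites)

tips = {}
evolve(TREE, root_seq, free_sites, rng, tips)

for x, y in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(x, y, hamming(tips[x], tips[y]))
```

Because no site is hit twice, each pairwise distance equals the number of SNPs on the path between the two tips (5, 15, and 16 here), so any tree or SNP set recovered by a pipeline from such data can be compared against a known answer.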


Subjects
Genomics/methods , Phylogeny
8.
J Clin Microbiol ; 55(3): 931-941, 2017 03.
Article in English | MEDLINE | ID: mdl-28053218

ABSTRACT

Three multistate outbreaks between 2014 and 2016, involving case patients in and outside the United States, were linked to stone fruit, caramel apples, and packaged leafy green salad contaminated with Listeria monocytogenes singleton sequence type 382 (ST382), a serotype IVb-v1 clone with limited genomic divergence. Isolates from these outbreaks and other ST382 isolates not associated with these outbreaks were analyzed by whole-genome sequencing (WGS) analysis. The primary differences among ST382 strains were single nucleotide polymorphisms (SNPs). WGS analysis differentiated ST382 from a clonal complex 1 outbreak strain co-contaminating the caramel apples. WGS clustered food, environmental, and clinical isolates within each outbreak, and also differentiated among the three outbreak strains and epidemiologically unrelated ST382 isolates, which were indistinguishable by pulsed-field gel electrophoresis. ST382 appeared to be an emerging clone that began to diverge from its ancestor approximately 32 years before 2016. We estimated that there were 1.29 nucleotide substitutions per genome (2.94 Mbp) per year for this clone.
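
The rate quoted above can be restated per site, and combined with the divergence time, using simple arithmetic. The sketch below uses only the three figures given in the abstract; treating the rate as constant over the whole period is an assumption made here for illustration, not a claim from the paper.

```python
# Figures quoted in the abstract above.
SUBS_PER_GENOME_PER_YEAR = 1.29
GENOME_SIZE_BP = 2.94e6
YEARS_OF_DIVERGENCE = 32


def per_site_rate(subs_per_genome_per_year, genome_size_bp):
    """Convert a per-genome yearly rate into a per-site yearly rate."""
    return subs_per_genome_per_year / genome_size_bp


def expected_substitutions(subs_per_genome_per_year, years):
    """Expected substitutions along one lineage, assuming a constant rate."""
    return subs_per_genome_per_year * years


rate = per_site_rate(SUBS_PER_GENOME_PER_YEAR, GENOME_SIZE_BP)
total = expected_substitutions(SUBS_PER_GENOME_PER_YEAR, YEARS_OF_DIVERGENCE)

print(f"{rate:.2e} substitutions per site per year")   # 4.39e-07
print(f"{total:.1f} substitutions per lineage over {YEARS_OF_DIVERGENCE} years")  # 41.3
```

Under that constant-rate assumption, a single lineage would accumulate roughly 41 substitutions over the 32 years, which is consistent with the abstract's description of limited genomic divergence.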


Subjects
Disease Outbreaks , Food Microbiology , Foodborne Diseases/epidemiology , Genotype , Listeria monocytogenes/classification , Listeriosis/epidemiology , Multilocus Sequence Typing , Adolescent , Aged , Child , Child, Preschool , Cluster Analysis , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Female , Foodborne Diseases/microbiology , Genome, Bacterial , Humans , Listeria monocytogenes/genetics , Listeria monocytogenes/isolation & purification , Listeriosis/microbiology , Male , Molecular Epidemiology , Polymorphism, Single Nucleotide , United States
9.
Appl Environ Microbiol ; 83(15)2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28550058

ABSTRACT

Epidemiological findings of a listeriosis outbreak in 2013 implicated Hispanic-style cheese produced by company A, and pulsed-field gel electrophoresis (PFGE) and whole genome sequencing (WGS) were performed on clinical isolates and representative isolates collected from company A cheese and environmental samples during the investigation. The results strengthened the evidence for cheese as the vehicle. Surveillance sampling and WGS 3 months later revealed that the equipment purchased by company B from company A yielded an environmental isolate highly similar to all outbreak isolates. The results of whole genome and core genome multilocus sequence typing and single nucleotide polymorphism (SNP) analyses were compared to demonstrate the maximum discriminatory power obtained by using multiple analyses, which were needed to differentiate outbreak-associated isolates from a PFGE-indistinguishable isolate collected in a nonimplicated food source in 2012. This unrelated isolate differed from the outbreak isolates by only 7 to 14 SNPs, and as a result, the minimum spanning tree from the whole genome analyses and certain variant calling approaches and phylogenetic algorithms for core genome-based analyses could not provide differentiation between unrelated isolates. Our data also suggest that SNP/allele counts should always be combined with WGS clustering analysis generated by phylogenetically meaningful algorithms on a sufficient number of isolates, and the SNP/allele threshold alone does not provide sufficient evidence to delineate an outbreak. The putative prophages were conserved across all the outbreak isolates. All outbreak isolates belonged to clonal complex 5 and serotype 1/2b and had an identical inlA sequence which did not have premature stop codons. IMPORTANCE: In this outbreak, multiple analytical approaches were used for maximum discriminatory power. A PFGE-matched, epidemiologically unrelated isolate had high genetic similarity to the outbreak-associated isolates, with as few as 7 SNP differences. Therefore, the SNP/allele threshold should not be used as the only evidence to define the scope of an outbreak. It is critical that the SNP/allele counts be complemented by WGS clustering analysis generated by phylogenetically meaningful algorithms to distinguish outbreak-associated isolates from epidemiologically unrelated isolates. Careful selection of a variant calling approach and phylogenetic algorithm is critical for core-genome-based analyses. The whole-genome-based analyses were able to construct the highly resolved phylogeny needed to support the findings of the outbreak investigation. Ultimately, epidemiologic evidence and multiple WGS analyses should be combined to increase confidence levels during outbreak investigations.
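
The warning about relying on a SNP threshold alone can be made concrete with a toy pairwise comparison. The sequences, isolate names, and the 10-SNP cutoff below are all invented for illustration; only the idea of an unrelated isolate sitting 7 SNPs from an outbreak isolate is taken from the abstract, and this sketch does not reproduce any analysis from the study.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def with_snps(seq, positions, alt="G"):
    """Return a copy of seq with the given positions set to alt."""
    bases = list(seq)
    for pos in positions:
        bases[pos] = alt
    return "".join(bases)


# Hypothetical aligned variable sites for four isolates.
backbone = "A" * 30
isolates = {
    "outbreak_1": backbone,
    "outbreak_2": with_snps(backbone, [0, 1]),
    "outbreak_3": with_snps(backbone, [2, 3, 4]),
    "unrelated_2012": with_snps(backbone, range(10, 17)),
}

THRESHOLD = 10  # hypothetical cutoff, not a value from the study

# Every isolate that falls within the cutoff of at least one other isolate.
linked = {
    name
    for a, b in combinations(isolates, 2)
    if snp_distance(isolates[a], isolates[b]) <= THRESHOLD
    for name in (a, b)
}

print(snp_distance(isolates["outbreak_1"], isolates["unrelated_2012"]))  # 7
print(sorted(linked))
```

With this cutoff the unrelated isolate is grouped with the outbreak isolates, which is the failure mode the authors describe; distinguishing it requires the tree-based clustering and epidemiologic evidence they recommend, neither of which a pairwise count captures.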

10.
Planta Med ; 83(18): 1420-1430, 2017 Dec.
Article in English | MEDLINE | ID: mdl-28651291

ABSTRACT

Precise, species-level identification of plants in foods and dietary supplements is difficult. While the use of DNA barcoding regions (short regions of DNA with diagnostic utility) has been effective for many inquiries, it is not always a robust approach for closely related species, especially in highly processed products. The use of fully sequenced chloroplast genomes, as an alternative to short diagnostic barcoding regions, has demonstrated utility for closely related species. The U.S. Food and Drug Administration (FDA) has also developed species-specific DNA-based assays targeting plant species of interest by utilizing chloroplast genome sequences. Here, we introduce a repository of complete chloroplast genome sequences called GenomeTrakrCP, which will be publicly available at the National Center for Biotechnology Information (NCBI). Target species for inclusion are plants found in foods and dietary supplements, toxin producers, common contaminants and adulterants, and their close relatives. Publicly available data will include annotated assemblies, raw sequencing data, and voucher information with each NCBI accession associated with an authenticated reference herbarium specimen. To date, 40 complete chloroplast genomes have been deposited in GenomeTrakrCP (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA325670/), and this will be expanded in the future.


Subjects
Databases, Nucleic Acid/standards , Genome, Chloroplast/genetics , Plants/classification , DNA Barcoding, Taxonomic , DNA, Chloroplast/chemistry , DNA, Chloroplast/genetics , Molecular Sequence Annotation , Plant Leaves/classification , Plant Leaves/genetics , Plants/genetics , Reference Standards , Species Specificity , United States , United States Food and Drug Administration
11.
Clin Infect Dis ; 63(3): 380-6, 2016 08 01.
Article in English | MEDLINE | ID: mdl-27090985

ABSTRACT

Listeria monocytogenes (Lm) causes severe foodborne illness (listeriosis). Previous molecular subtyping methods, such as pulsed-field gel electrophoresis (PFGE), were critical in detecting outbreaks that led to food safety improvements and declining incidence, but PFGE provides limited genetic resolution. A multiagency collaboration began performing real-time, whole-genome sequencing (WGS) on all US Lm isolates from patients, food, and the environment in September 2013, posting sequencing data into a public repository. Compared with the year before the project began, WGS, combined with epidemiologic and product trace-back data, detected more listeriosis clusters and solved more outbreaks (2 outbreaks in pre-WGS year, 5 in WGS year 1, and 9 in year 2). Whole-genome multilocus sequence typing and single nucleotide polymorphism analyses provided equivalent phylogenetic relationships relevant to investigations; results were most useful when interpreted in context of epidemiological data. WGS has transformed listeriosis outbreak surveillance and is being implemented for other foodborne pathogens.


Subjects
Disease Outbreaks , Foodborne Diseases/epidemiology , Genome, Bacterial/genetics , Listeria monocytogenes/classification , Listeriosis/epidemiology , Whole Genome Sequencing/methods , Food Safety , Foodborne Diseases/microbiology , High-Throughput Nucleotide Sequencing , Humans , Listeria monocytogenes/genetics , Listeria monocytogenes/isolation & purification , Listeriosis/microbiology , Multilocus Sequence Typing , Phylogeny , Sequence Analysis, DNA
12.
J Clin Microbiol ; 54(8): 1975-83, 2016 08.
Article in English | MEDLINE | ID: mdl-27008877

ABSTRACT

The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks.


Subjects
Disease Outbreaks , Food Microbiology/methods , Food Safety/methods , Foodborne Diseases/epidemiology , Foodborne Diseases/prevention & control , Genomics/methods , Humans , United States/epidemiology
14.
Appl Environ Microbiol ; 82(24): 7030-7040, 2016 12 15.
Article in English | MEDLINE | ID: mdl-27694232

ABSTRACT

In 2014, the identification of stone fruits contaminated with Listeria monocytogenes led to the subsequent identification of a multistate outbreak. Simultaneous detection and enumeration of L. monocytogenes were performed on 105 fruits, each weighing 127 to 145 g, collected from 7 contaminated lots. The results showed that 53.3% of the fruits yielded L. monocytogenes (lower limit of detection, 5 CFU/fruit), and the levels ranged from 5 to 2,850 CFU/fruit, with a geometric mean of 11.3 CFU/fruit (0.1 CFU/g of fruit). Two serotypes, IVb-v1 and 1/2b, were identified by a combination of PCR- and antiserum-based serotyping among isolates from fruits and their packing environment; certain fruits contained a mixture of both serotypes. Single nucleotide polymorphism (SNP)-based whole-genome sequencing (WGS) analysis clustered isolates from two case-patients with the serotype IVb-v1 isolates and distinguished outbreak-associated isolates from pulsed-field gel electrophoresis (PFGE)-matched, but epidemiologically unrelated, clinical isolates. The outbreak-associated isolates differed by up to 42 SNPs. All but one serotype 1/2b isolate formed another WGS cluster and differed by up to 17 SNPs. Fully closed genomes of isolates from the stone fruits were used as references to maximize the resolution and to increase our confidence in prophage analysis. Putative prophages were conserved among isolates of each WGS cluster. All serotype IVb-v1 isolates belonged to singleton sequence type 382 (ST382); all but one serotype 1/2b isolate belonged to clonal complex 5. IMPORTANCE: WGS proved to be an excellent tool to assist in the epidemiologic investigation of listeriosis outbreaks. The comparison at the genome level contributed to our understanding of the genetic diversity and variations among isolates involved in an outbreak or isolates associated with food and environmental samples from one facility. 
Fully closed genomes increased our confidence in the identification and comparison of accessory genomes. The diversity among the outbreak-associated isolates and the inclusion of PFGE-matched, but epidemiologically unrelated, isolates demonstrate the high resolution of WGS. The prevalence and enumeration data could contribute to our further understanding of the risk associated with Listeria monocytogenes contamination, especially among high-risk populations.
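
The abstract reports a geometric mean of 11.3 CFU/fruit and an equivalent of about 0.1 CFU/g. The sketch below shows how a geometric mean is computed and how the per-fruit figure converts to a per-gram figure across the stated weight range of 127 to 145 g. The list of counts is hypothetical; the study's raw data are not given in the abstract.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values, computed on the log scale."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one positive value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-fruit counts, only to show the calculation.
hypothetical_counts = [5, 5, 10, 20, 2850]
print(f"{geometric_mean(hypothetical_counts):.1f} CFU/fruit (hypothetical data)")

# Values quoted in the abstract above.
REPORTED_GM_CFU_PER_FRUIT = 11.3
FRUIT_WEIGHT_RANGE_G = (127, 145)

for weight in FRUIT_WEIGHT_RANGE_G:
    per_gram = REPORTED_GM_CFU_PER_FRUIT / weight
    print(f"{weight} g fruit: {per_gram:.3f} CFU/g")
```

Dividing 11.3 CFU/fruit by either end of the weight range gives roughly 0.08 to 0.09 CFU/g, which rounds to the 0.1 CFU/g reported. A geometric mean is the usual choice for counts like these because a few heavily contaminated fruits (up to 2,850 CFU) would dominate an arithmetic mean.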


Subjects
Food Contamination/analysis , Fruit/microbiology , Genome, Bacterial , Listeria monocytogenes/genetics , Listeria monocytogenes/isolation & purification , Bacterial Typing Techniques , Electrophoresis, Gel, Pulsed-Field , Listeria monocytogenes/classification , Listeria monocytogenes/growth & development , Phylogeny , Polymorphism, Single Nucleotide
15.
Microbiology (Reading) ; 161(2): 374-386, 2015 Feb.
Article in English | MEDLINE | ID: mdl-28206902

ABSTRACT

Prokaryotic CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated genes) systems provide adaptive immunity from invasive genetic elements and encompass three essential features: (i) cas genes, (ii) a CRISPR array composed of spacers and direct repeats and (iii) an AT-rich leader sequence upstream of the array. We performed in-depth sequence analysis of the CRISPR-Cas systems in >600 Salmonella, representing four clinically prevalent serovars. Each CRISPR-Cas feature is extremely conserved in the Salmonella, and the CRISPR1 locus is more highly conserved than CRISPR2. Array composition is serovar-specific, although no convincing evidence of recent spacer acquisition against exogenous nucleic acids exists. Only 12 % of spacers match phage and plasmid sequences and self-targeting spacers are associated with direct repeat variants. High nucleotide identity (>99.9 %) exists across the cas operon among isolates of a single serovar and in some cases this conservation extends across divergent serovars. These observations reflect historical CRISPR-Cas immune activity, showing that this locus has ceased undergoing adaptive events. Intriguingly, the high level of conservation across divergent serovars shows that the genetic integrity of these inactive loci is maintained over time, contrasting with the canonical view that inactive CRISPR loci degenerate over time. This thorough characterization of Salmonella CRISPR-Cas systems presents new insights into Salmonella CRISPR evolution, particularly with respect to cas gene conservation, leader sequences, organization of direct repeats and protospacer matches. Collectively, our data suggest that Salmonella CRISPR-Cas systems are no longer immunogenic; rather, their impressive conservation indicates they may have an alternative function in Salmonella.

16.
Microbiology (Reading) ; 161(Pt 2): 374-86, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25479838

ABSTRACT

Prokaryotic CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated genes) systems provide adaptive immunity from invasive genetic elements and encompass three essential features: (i) cas genes, (ii) a CRISPR array composed of spacers and direct repeats and (iii) an AT-rich leader sequence upstream of the array. We performed in-depth sequence analysis of the CRISPR-Cas systems in >600 Salmonella, representing four clinically prevalent serovars. Each CRISPR-Cas feature is extremely conserved in the Salmonella, and the CRISPR1 locus is more highly conserved than CRISPR2. Array composition is serovar-specific, although no convincing evidence of recent spacer acquisition against exogenous nucleic acids exists. Only 12% of spacers match phage and plasmid sequences and self-targeting spacers are associated with direct repeat variants. High nucleotide identity (>99.9%) exists across the cas operon among isolates of a single serovar and in some cases this conservation extends across divergent serovars. These observations reflect historical CRISPR-Cas immune activity, showing that this locus has ceased undergoing adaptive events. Intriguingly, the high level of conservation across divergent serovars shows that the genetic integrity of these inactive loci is maintained over time, contrasting with the canonical view that inactive CRISPR loci degenerate over time. This thorough characterization of Salmonella CRISPR-Cas systems presents new insights into Salmonella CRISPR evolution, particularly with respect to cas gene conservation, leader sequences, organization of direct repeats and protospacer matches. Collectively, our data suggest that Salmonella CRISPR-Cas systems are no longer immunogenic; rather, their impressive conservation indicates they may have an alternative function in Salmonella.


Subjects
CRISPR-Cas Systems , Evolution, Molecular , Salmonella/genetics , Bacterial Typing Techniques , Base Sequence , DNA, Bacterial/genetics , Genetic Variation , Humans , Molecular Sequence Data , Polymorphism, Single Nucleotide , Salmonella/classification , Salmonella/isolation & purification , Salmonella Infections/microbiology
17.
Appl Environ Microbiol ; 80(7): 2125-32, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24463972

ABSTRACT

Clostridium botulinum is a genetically diverse Gram-positive bacterium producing extremely potent neurotoxins (botulinum neurotoxins A through G [BoNT/A-G]). The complete genome sequences of three strains harboring only the BoNT/A1 nucleotide sequence are publicly available. Although these strains contain a toxin cluster (HA(+) OrfX(-)) associated with hemagglutinin genes, little is known about the genomes of subtype A1 strains (termed HA(-) OrfX(+)) that lack hemagglutinin genes in the toxin gene cluster. We sequenced the genomes of three BoNT/A1-producing C. botulinum strains: two strains with the HA(+) OrfX(-) cluster (69A and 32A) and one strain with the HA(-) OrfX(+) cluster (CDC297). Whole-genome phylogenic single-nucleotide-polymorphism (SNP) analysis of these strains along with other publicly available C. botulinum group I strains revealed five distinct lineages. Strains 69A and 32A clustered with the C. botulinum type A1 Hall group, and strain CDC297 clustered with the C. botulinum type Ba4 strain 657. This study reports the use of whole-genome SNP sequence analysis for discrimination of C. botulinum group I strains and demonstrates the utility of this analysis in quickly differentiating C. botulinum strains harboring identical toxin gene subtypes. This analysis further supports previous work showing that strains CDC297 and 657 likely evolved from a common ancestor and independently acquired separate BoNT/A1 toxin gene clusters at distinct genomic locations.


Subjects
Bacteriological Techniques/methods , Clostridium botulinum/classification , Clostridium botulinum/genetics , Genome, Bacterial , Polymorphism, Single Nucleotide , Cluster Analysis , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Genotype , Molecular Diagnostic Techniques/methods , Molecular Sequence Data , Sequence Analysis, DNA
18.
mSystems ; 9(6): e0141523, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38819130

ABSTRACT

Wastewater surveillance has emerged as a crucial public health tool for population-level pathogen surveillance. Supported by funding from the American Rescue Plan Act of 2021, the FDA's genomic epidemiology program, GenomeTrakr, was leveraged to sequence SARS-CoV-2 from wastewater sites across the United States. This initiative required the evaluation, optimization, development, and publication of new methods and analytical tools spanning sample collection through variant analyses. Version-controlled protocols for each step of the process were developed and published on protocols.io. A custom data analysis tool and a publicly accessible dashboard were built to facilitate real-time visualization of the collected data, focusing on the relative abundance of SARS-CoV-2 variants and sub-lineages across different samples and sites throughout the project. From September 2021 through June 2023, a total of 3,389 wastewater samples were collected, with 2,517 undergoing sequencing and submission to NCBI under the umbrella BioProject, PRJNA757291. Sequence data were released with explicit quality control (QC) tags on all sequence records, communicating our confidence in the quality of data. Variant analysis revealed wide circulation of Delta in the fall of 2021 and captured the sweep of Omicron and subsequent diversification of this lineage through the end of the sampling period. This project successfully achieved two important goals for the FDA's GenomeTrakr program: first, contributing timely genomic data for the SARS-CoV-2 pandemic response, and second, establishing both capacity and best practices for culture-independent, population-level environmental surveillance for other pathogens of interest to the FDA. IMPORTANCE: This paper serves two primary objectives. 
First, it summarizes the genomic and contextual data collected during a COVID-19 pandemic response project, which utilized the FDA's laboratory network, traditionally employed for sequencing foodborne pathogens, for sequencing SARS-CoV-2 from wastewater samples. Second, it outlines best practices for gathering and organizing population-level next-generation sequencing (NGS) data collected for culture-free surveillance of pathogens sourced from environmental samples.


Subjects
COVID-19 , SARS-CoV-2 , United States Food and Drug Administration , Wastewater , SARS-CoV-2/genetics , United States/epidemiology , Wastewater/virology , COVID-19/epidemiology , COVID-19/transmission , COVID-19/prevention & control , COVID-19/virology , Humans , Pandemics/prevention & control , Viral Genome/genetics , Wastewater-Based Epidemiological Monitoring
19.
Microb Genom ; 10(6)2024 Jun.
Article in English | MEDLINE | ID: mdl-38860884

ABSTRACT

As public health laboratories expand their genomic sequencing and bioinformatics capacity for the surveillance of different pathogens, labs must carry out robust validation, training, and optimization of wet- and dry-lab procedures. Achieving these goals for algorithms, pipelines and instruments often requires that lower-quality datasets be made available for analysis and comparison alongside those of higher quality. This range of data quality in reference sets can complicate the sharing of sub-optimal datasets that are vital for the community and for the reproducibility of assays. Sharing useful but sub-optimal datasets requires careful annotation and documentation of known issues so that the data can be interpreted appropriately, are not mistaken for better-quality information, and (along with their derivatives) are easily identifiable in repositories. Unfortunately, there are currently no standardized attributes or mechanisms for tagging poor-quality datasets, or datasets generated for a specific purpose, to maximize their utility, searchability, accessibility and reuse. The Public Health Alliance for Genomic Epidemiology (PHA4GE) is an international community of scientists from public health, industry and academia focused on improving the reproducibility, interoperability, portability, and openness of public health bioinformatic software, skills, tools and data. To address the challenges of sharing lower-quality datasets, PHA4GE has developed a set of standardized contextual data tags, namely fields and terms, that can be included in public repository submissions as a means of flagging pathogen sequence data with known quality issues, increasing their discoverability.
The contextual data tags were developed through consultations with the community, including input from the International Nucleotide Sequence Database Collaboration (INSDC), and have been standardized using ontologies, which are community-based resources for defining the tag properties and the relationships between them. The standardized tags are agnostic to the organism and the sequencing technique used and thus can be applied to data generated from any pathogen using an array of sequencing techniques. The tags can also be applied to synthetic (lab-created) data. The list of standardized tags is maintained by PHA4GE and can be found at https://github.com/pha4ge/contextual_data_QC_tags. Definitions, ontology IDs, examples of use, as well as a JSON representation, are provided. The PHA4GE QC tags were tested, and are now implemented, by the FDA's GenomeTrakr laboratory network as part of its routine submission process for SARS-CoV-2 wastewater surveillance. We hope that these simple, standardized tags will help improve communication regarding quality control in public repositories, in addition to making datasets of variable quality more easily identifiable. Suggestions for additional tags can be submitted to PHA4GE via the New Term Request Form in the GitHub repository. By providing a mechanism for feedback and suggestions, we also expect that the tags will evolve with the needs of the community.


Subjects
Computational Biology , Public Health , Quality Control , Humans , Computational Biology/methods , Information Dissemination/methods , Reproducibility of Results , Molecular Sequence Annotation/methods , Genomics/methods , Software
20.
PeerJ ; 11: e14596, 2023.
Article in English | MEDLINE | ID: mdl-36721781

ABSTRACT

Background: The accurate identification of SARS-CoV-2 (SC2) variants and estimation of their abundance in mixed population samples (e.g., air or wastewater) is imperative for successful surveillance of community-level trends. Assessing the performance of SC2 variant composition estimators (VCEs) should improve our confidence in public health decision making. Here, we introduce a linear-regression-based VCE and compare its performance to four other VCEs: two re-purposed DNA sequence read classifiers (Kallisto and Kraken2), a maximum-likelihood-based method (Lineage deComposition for Sars-Cov-2 pooled samples (LCS)), and a regression-based method (Freyja). Methods: We simulated DNA sequence datasets of known variant composition from both Illumina and Oxford Nanopore Technologies (ONT) platforms and assessed the performance of each VCE. We also evaluated VCE performance using publicly available empirical wastewater samples collected for SC2 surveillance efforts. Bioinformatic analyses were performed with a custom Nextflow workflow (C-WAP, CFSAN Wastewater Analysis Pipeline). Relative root mean squared error (RRMSE) was used as a measure of performance with respect to the known abundance, and concordance correlation coefficient (CCC) was used to measure agreement between pairs of estimators. Results: Based on our results from simulated data, Kallisto was the most accurate estimator as it had the lowest RRMSE, followed by Freyja. Kallisto and Freyja had the most similar predictions, reflected by the highest CCC metrics. We also found that accuracy was platform and amplicon panel dependent. For example, the accuracy of Freyja was significantly higher with Illumina data compared to ONT data; performance of Kallisto was best with ARTICv4. However, when analyzing empirical data there was poor agreement among methods and variations in the number of variants detected (e.g., Freyja ARTICv4 had a mean of 2.2 variants while Kallisto ARTICv4 had a mean of 10.1 variants).
Conclusion: This work provides an understanding of the differences in performance of a number of VCEs and how accurate they are in capturing the relative abundance of SC2 variants within a mixed sample (e.g., wastewater). Such information should help officials gauge the confidence they can have in such data for informing public health decisions.
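
The two metrics named in the Methods can be sketched in a few lines. The abstract does not state which RRMSE normalization was used, so the version below assumes one common convention (RMSE divided by the mean of the known abundances); the CCC follows Lin's standard definition with population (1/n) variances. Function names and the example abundances are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import fmean


def rrmse(estimated, known):
    """Relative RMSE: RMSE of the estimates divided by the mean known abundance.

    Assumed convention; other normalizations of RRMSE exist.
    """
    if len(estimated) != len(known) or not known:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = fmean([(e - k) ** 2 for e, k in zip(estimated, known)])
    return sqrt(mse) / fmean(known)


def ccc(x, y):
    """Lin's concordance correlation coefficient between two sets of estimates."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = fmean([(a - mean_x) ** 2 for a in x])
    var_y = fmean([(b - mean_y) ** 2 for b in y])
    cov_xy = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Made-up variant abundances for one simulated sample (illustration only).
known_abundance = [0.5, 0.3, 0.2]
estimator_a = [0.6, 0.2, 0.2]
estimator_b = [0.55, 0.25, 0.2]

print("RRMSE of estimator A:", rrmse(estimator_a, known_abundance))
print("CCC between A and B:", ccc(estimator_a, estimator_b))
```

Under these definitions a lower RRMSE means estimates closer to the known composition, and a CCC near 1 means two estimators agree in both trend and scale, which is how the abstract uses the two measures.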


Subjects
COVID-19 , Humans , COVID-19/diagnosis , Likelihood Functions , SARS-CoV-2/genetics , Wastewater