Search | BVS Regional Portal

1.

PHA4GE quality control contextual data tags: standardized annotations for sharing public health sequence datasets with known quality issues to facilitate testing and training.

Griffiths, Emma J; Mendes, Inês; Maguire, Finlay; Guthrie, Jennifer L; Wee, Bryan A; Schmedes, Sarah; Holt, Kathryn; Yadav, Chanchal; Cameron, Rhiannon; Barclay, Charlotte; Dooley, Damion; MacCannell, Duncan; Chindelevitch, Leonid; Karsch-Mizrachi, Ilene; Waheed, Zahra; Katz, Lee; Petit III, Robert; Dave, Mugdha; Oluniyi, Paul; Nasar, Muhammad Ibtisam; Raphenya, Amogelang; Hsiao, William W L; Timme, Ruth E.

Microb Genom ; 10(6)2024 Jun.

Article in English | MEDLINE | ID: mdl-38860884

ABSTRACT

As public health laboratories expand their genomic sequencing and bioinformatics capacity for the surveillance of different pathogens, labs must carry out robust validation, training, and optimization of wet- and dry-lab procedures. Achieving these goals for algorithms, pipelines and instruments often requires that lower quality datasets be made available for analysis and comparison alongside those of higher quality. This range of data quality in reference sets can complicate the sharing of sub-optimal datasets that are vital for the community and for the reproducibility of assays. Sharing of useful, but sub-optimal datasets requires careful annotation and documentation of known issues to enable appropriate interpretation, avoid being mistaken for better quality information, and for these data (and their derivatives) to be easily identifiable in repositories. Unfortunately, there are currently no standardized attributes or mechanisms for tagging poor-quality datasets, or datasets generated for a specific purpose, to maximize their utility, searchability, accessibility and reuse. The Public Health Alliance for Genomic Epidemiology (PHA4GE) is an international community of scientists from public health, industry and academia focused on improving the reproducibility, interoperability, portability, and openness of public health bioinformatic software, skills, tools and data. To address the challenges of sharing lower quality datasets, PHA4GE has developed a set of standardized contextual data tags, namely fields and terms, that can be included in public repository submissions as a means of flagging pathogen sequence data with known quality issues, increasing their discoverability. 
The contextual data tags were developed through consultations with the community including input from the International Nucleotide Sequence Data Collaboration (INSDC), and have been standardized using ontologies - community-based resources for defining the tag properties and the relationships between them. The standardized tags are agnostic to the organism and the sequencing technique used and thus can be applied to data generated from any pathogen using an array of sequencing techniques. The tags can also be applied to synthetic (lab created) data. The list of standardized tags is maintained by PHA4GE and can be found at https://github.com/pha4ge/contextual_data_QC_tags. Definitions, ontology IDs, examples of use, as well as a JSON representation, are provided. The PHA4GE QC tags were tested, and are now implemented, by the FDA's GenomeTrakr laboratory network as part of its routine submission process for SARS-CoV-2 wastewater surveillance. We hope that these simple, standardized tags will help improve communication regarding quality control in public repositories, in addition to making datasets of variable quality more easily identifiable. Suggestions for additional tags can be submitted to PHA4GE via the New Term Request Form in the GitHub repository. By providing a mechanism for feedback and suggestions, we also expect that the tags will evolve with the needs of the community.

Subject(s)

Computational Biology , Public Health , Quality Control , Humans , Computational Biology/methods , Information Dissemination/methods , Reproducibility of Results , Molecular Sequence Annotation/methods , Genomics/methods , Software

2.

SARS-CoV-2 wastewater variant surveillance: pandemic response leveraging FDA's GenomeTrakr network.

Timme, Ruth E; Woods, Jacquelina; Jones, Jessica L; Calci, Kevin R; Rodriguez, Rachel; Barnes, Candace; Leard, Elizabeth; Craven, Mark; Chen, Haifeng; Boerner, Cameron; Grim, Christopher; Windsor, Amanda M; Ramachandran, Padmini; Muruvanda, Tim; Rand, Hugh; Tesfaldet, Bereket; Amirzadegan, Jasmine; Kayikcioglu, Tunc; Walsky, Tamara; Allard, Marc; Balkey, Maria; Bias, C Hope; Brown, Eric; Judy, Kathryn; Pfefer, Tina; Tallent, Sandra M; Hoffmann, Maria; Pettengill, James.

mSystems ; 9(6): e0141523, 2024 Jun 18.

Article in English | MEDLINE | ID: mdl-38819130

ABSTRACT

Wastewater surveillance has emerged as a crucial public health tool for population-level pathogen surveillance. Supported by funding from the American Rescue Plan Act of 2021, the FDA's genomic epidemiology program, GenomeTrakr, was leveraged to sequence SARS-CoV-2 from wastewater sites across the United States. This initiative required the evaluation, optimization, development, and publication of new methods and analytical tools spanning sample collection through variant analyses. Version-controlled protocols for each step of the process were developed and published on protocols.io. A custom data analysis tool and a publicly accessible dashboard were built to facilitate real-time visualization of the collected data, focusing on the relative abundance of SARS-CoV-2 variants and sub-lineages across different samples and sites throughout the project. From September 2021 through June 2023, a total of 3,389 wastewater samples were collected, with 2,517 undergoing sequencing and submission to NCBI under the umbrella BioProject, PRJNA757291. Sequence data were released with explicit quality control (QC) tags on all sequence records, communicating our confidence in the quality of data. Variant analysis revealed wide circulation of Delta in the fall of 2021 and captured the sweep of Omicron and subsequent diversification of this lineage through the end of the sampling period. This project successfully achieved two important goals for the FDA's GenomeTrakr program: first, contributing timely genomic data for the SARS-CoV-2 pandemic response, and second, establishing both capacity and best practices for culture-independent, population-level environmental surveillance for other pathogens of interest to the FDA. IMPORTANCE: This paper serves two primary objectives. 
First, it summarizes the genomic and contextual data collected during a Covid-19 pandemic response project, which utilized the FDA's laboratory network, traditionally employed for sequencing foodborne pathogens, for sequencing SARS-CoV-2 from wastewater samples. Second, it outlines best practices for gathering and organizing population-level next generation sequencing (NGS) data collected for culture-free surveillance of pathogens sourced from environmental samples.

Subject(s)

COVID-19 , SARS-CoV-2 , United States Food and Drug Administration , Wastewater , SARS-CoV-2/genetics , United States/epidemiology , Wastewater/virology , COVID-19/epidemiology , COVID-19/transmission , COVID-19/prevention & control , COVID-19/virology , Humans , Pandemics/prevention & control , Genome, Viral/genetics , Wastewater-Based Epidemiological Monitoring

3.

A One Health Perspective on Salmonella enterica Serovar Infantis, an Emerging Human Multidrug-Resistant Pathogen.

Mattock, Jennifer; Chattaway, Marie Anne; Hartman, Hassan; Dallman, Timothy J; Smith, Anthony M; Keddy, Karen; Petrovska, Liljana; Manners, Emma J; Duze, Sanelisiwe T; Smouse, Shannon; Tau, Nomsa; Timme, Ruth; Baker, Dave J; Mather, Alison E; Wain, John; Langridge, Gemma C.

Emerg Infect Dis ; 30(4): 701-710, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38526070

ABSTRACT

Salmonella enterica serovar Infantis presents an ever-increasing threat to public health because of its spread throughout many countries and association with high levels of antimicrobial resistance (AMR). We analyzed whole-genome sequences of 5,284 Salmonella Infantis strains from 74 countries, isolated during 1989-2020 from a wide variety of human, animal, and food sources, to compare genetic phylogeny, AMR determinants, and plasmid presence. The global Salmonella Infantis population structure diverged into 3 clusters: a North American cluster, a European cluster, and a global cluster. The levels of AMR varied by Salmonella Infantis cluster and by isolation source; 73% of poultry isolates were multidrug resistant, compared with 35% of human isolates. This finding correlated with the presence of the pESI megaplasmid; 71% of poultry isolates contained pESI, compared with 32% of human isolates. This study provides key information for public health teams engaged in reducing the spread of this pathogen.

Subject(s)

One Health , Salmonella enterica , Animals , Humans , Serogroup , Anti-Bacterial Agents/pharmacology , Salmonella/genetics , Poultry , Drug Resistance, Multiple, Bacterial/genetics

4.

Editorial: Integration of NGS in clinical and public health microbiology workflows: applications, compliance, quality considerations.

Yang, Shangxin; Kozyreva, Varvara K; Timme, Ruth E; Hemarajata, Peera.

Front Public Health ; 12: 1357098, 2024.

Article in English | MEDLINE | ID: mdl-38322128

Subject(s)

Microbiological Techniques , Public Health , Workflow , High-Throughput Nucleotide Sequencing

5.

Putting everything in its place: using the INSDC compliant Pathogen Data Object Model to better structure genomic data submitted for public health applications.

Timme, Ruth E; Karsch-Mizrachi, Ilene; Waheed, Zahra; Arita, Masanori; MacCannell, Duncan; Maguire, Finlay; Petit III, Robert; Page, Andrew J; Mendes, Catarina Inês; Nasar, Muhammad Ibtisam; Oluniyi, Paul; Tyler, Andrea D; Raphenya, Amogelang R; Guthrie, Jennifer L; Olawoye, Idowu; Rinck, Gabriele; O'Cathail, Colman; Lees, John; Cochrane, Guy; Cummins, Carla; Brister, J Rodney; Klimke, William; Feldgarden, Michael; Griffiths, Emma.

Microb Genom ; 9(12)2023 Dec.

Article in English | MEDLINE | ID: mdl-38085797

ABSTRACT

Fast, efficient public health actions require well-organized and coordinated systems that can supply timely and accurate knowledge. Public databases of pathogen genomic data, such as the International Nucleotide Sequence Database Collaboration (INSDC), have become essential tools for efficient public health decisions. However, these international resources began primarily for academic purposes, rather than for surveillance or interventions. Now, queries need to access not only the whole genomes of multiple pathogens but also make connections using robust contextual metadata to identify issues of public health relevance. Databases that over time developed a patchwork of submission formats and requirements need to be consistently organized and coordinated internationally to allow effective searches. To help resolve these issues, we propose a common pathogen data structure called the Pathogen Data Object Model (DOM) that will formalize the minimum pieces of sequence data and contextual data necessary for general public health uses, while recognizing that submitters will likely withhold a wide range of non-public contextual data. Further, we propose contributors use the Pathogen DOM for all pathogen submissions (bacterial, viral, fungal, and parasites), which will simplify data submissions and provide a consistent and transparent data structure for downstream data analyses. We also highlight how improved submission tools can support the Pathogen DOM, offering users additional easy-to-use methods to ensure this structure is followed.

Subject(s)

Nucleotides , Public Health , Base Sequence , Genomics/methods , Databases, Nucleic Acid

6.

Application of quasimetagenomics methods to define microbial diversity and subtype Listeria monocytogenes in dairy and seafood production facilities.

Kocurek, Brandon; Ramachandran, Padmini; Grim, Christopher J; Morin, Paul; Howard, Laura; Ottesen, Andrea; Timme, Ruth; Leonard, Susan R; Rand, Hugh; Strain, Errol; Tadesse, Daniel; Pettengill, James B; Lacher, David W; Mammel, Mark; Jarvis, Karen G.

Microbiol Spectr ; 11(6): e0148223, 2023 Dec 12.

Article in English | MEDLINE | ID: mdl-37812012

ABSTRACT

IMPORTANCE: In developed countries, the human diet is predominated by food commodities, which have been manufactured, processed, and stored in a food production facility. Little is known about the application of metagenomic sequencing approaches for detecting foodborne pathogens, such as L. monocytogenes, and characterizing microbial diversity in food production ecosystems. In this work, we investigated the utility of 16S rRNA amplicon and quasimetagenomic sequencing for the taxonomic and phylogenetic classification of Listeria culture enrichments of environmental swabs collected from dairy and seafood production facilities. We demonstrated that single-nucleotide polymorphism (SNP) analyses of L. monocytogenes metagenome-assembled genomes (MAGs) from quasimetagenomic data sets can achieve similar resolution as culture isolate whole-genome sequencing. To further understand the impact of genome coverage on MAG SNP cluster resolution, an in silico downsampling approach was employed to reduce the percentage of target pathogen sequence reads, providing an initial estimate of required MAG coverage for subtyping resolution of L. monocytogenes.
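The in silico downsampling step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything here is an assumption for illustration: the function name `downsample_target`, the `is_target` predicate, and the choice to keep every non-target read while randomly retaining a fixed fraction of target pathogen reads.

```python
import random


def downsample_target(reads, is_target, keep_fraction, seed=0):
    """Randomly keep `keep_fraction` of target reads; keep all other reads.

    `reads` is any list of read records, and `is_target` is a hypothetical
    predicate that returns True for reads assigned to the target pathogen.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    target_idx = [i for i, read in enumerate(reads) if is_target(read)]
    n_keep = round(len(target_idx) * keep_fraction)
    keep = set(rng.sample(target_idx, n_keep))
    # Preserve the original read order; drop only unselected target reads.
    return [read for i, read in enumerate(reads)
            if i in keep or not is_target(read)]


# Toy data: 100 target reads mixed with 100 background reads.
reads = [("target", i) for i in range(100)] + [("background", i) for i in range(100)]
subset = downsample_target(reads, lambda r: r[0] == "target", keep_fraction=0.25)
```

Repeating the call over a range of `keep_fraction` values would give a series of datasets with progressively lower target coverage, which is the kind of input a coverage-versus-resolution comparison needs.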

Subject(s)

Listeria monocytogenes , Humans , Listeria monocytogenes/genetics , Food Microbiology , Phylogeny , RNA, Ribosomal, 16S/genetics , Ecosystem , Seafood

7.

Enterobacterales draft genome sequences: 15 historical (1998-2004) and 30 contemporary (2015-2016) clinical isolates from Pakistan.

Crawford, Matthew A; Lascols, Christine; Lomonaco, Sara; Timme, Ruth E; Fisher, Debra J; Anderson, Kevin; Hodge, David R; Morse, Stephen A; Pillai, Segaran P; Sharma, Shashi K; Khan, Erum; Allard, Marc W; Hughes, Molly A.

Microbiol Resour Announc ; 12(9): e0016323, 2023 Sep 19.

Article in English | MEDLINE | ID: mdl-37504519

ABSTRACT

The continued emergence and spread of antimicrobial resistance among pathogenic bacteria are ever-growing threats to health and economy. Here, we report the draft genomes for 45 Enterobacterales clinical isolates, including historical and contemporary drug-resistant organisms, obtained in Pakistan between 1998 and 2016: 5 Serratia, 3 Salmonella, 3 Enterobacter, and 34 Klebsiella.

8.

A Schema for Digitized Surface Swab Site Metadata in Open-Source DNA Sequence Databases.

Feng, Jingzhang; Daeschel, Devin; Dooley, Damion; Griffiths, Emma; Allard, Marc; Timme, Ruth; Chen, Yi; Snyder, Abigail B.

mSystems ; 8(2): e0128422, 2023 04 27.

Article in English | MEDLINE | ID: mdl-36847566

ABSTRACT

Large, open-source DNA sequence databases have been generated, in part, through the collection of microbial pathogens by swabbing surfaces in built environments. Analyzing these data in aggregate through public health surveillance requires digitization of the complex, domain-specific metadata that are associated with the swab site locations. However, the swab site location information is currently collected in a single, free-text "isolation source" field, promoting generation of poorly detailed descriptions with varying word order, granularity, and linguistic errors, making automation difficult and reducing machine-actionability. We assessed 1,498 free-text swab site descriptions that were generated during routine foodborne pathogen surveillance. The lexicon of free-text metadata was evaluated to determine the informational facets and the quantity of unique terms used by data collectors. Open Biological Ontologies (OBO) Foundry libraries were used to develop hierarchical vocabularies that are connected with logical relationships to describe swab site locations. Five informational facets, described by 338 unique terms, were identified via content analysis. Term hierarchy facets were developed, as were statements (called axioms) about how the entities within these five domains are related. The schema developed through this study has been integrated into a publicly available pathogen metadata standard, facilitating ongoing surveillance and investigations. The One Health Enteric Package was available at NCBI BioSample, beginning in 2022. The collective use of metadata standards increases the interoperability of DNA sequence databases and enables large-scale approaches to data sharing and artificial intelligence as well as big-data solutions to food safety. IMPORTANCE The regular analysis of whole-genome sequence data in collections such as NCBI's Pathogen Detection Database is used by many public health organizations to detect outbreaks of infectious disease.
However, isolate metadata in these databases are often incomplete and of poor quality. These complex, raw metadata must often be reorganized and manually formatted for use in aggregate analyses. These processes are inefficient and time-consuming, increasing the interpretative labor needed by public health groups to extract actionable information. The future use of open genomic epidemiology networks will be supported through the development of an internationally applicable vocabulary system with which swab site locations can be described.

Subject(s)

Communicable Diseases , Databases, Nucleic Acid , Humans , Metadata , Artificial Intelligence , Genomics

9.

Performance of methods for SARS-CoV-2 variant detection and abundance estimation within mixed population samples.

Kayikcioglu, Tunc; Amirzadegan, Jasmine; Rand, Hugh; Tesfaldet, Bereket; Timme, Ruth E; Pettengill, James B.

PeerJ ; 11: e14596, 2023.

Article in English | MEDLINE | ID: mdl-36721781

ABSTRACT

Background: The accurate identification of SARS-CoV-2 (SC2) variants and estimation of their abundance in mixed population samples (e.g., air or wastewater) is imperative for successful surveillance of community level trends. Assessing the performance of SC2 variant composition estimators (VCEs) should improve our confidence in public health decision making. Here, we introduce a linear regression based VCE and compare its performance to four other VCEs: two re-purposed DNA sequence read classifiers (Kallisto and Kraken2), a maximum-likelihood based method (Lineage deComposition for Sars-Cov-2 pooled samples (LCS)), and a regression based method (Freyja). Methods: We simulated DNA sequence datasets of known variant composition from both Illumina and Oxford Nanopore Technologies (ONT) platforms and assessed the performance of each VCE. We also evaluated VCE performance using publicly available empirical wastewater samples collected for SC2 surveillance efforts. Bioinformatic analyses were performed with a custom NextFlow workflow (C-WAP, CFSAN Wastewater Analysis Pipeline). Relative root mean squared error (RRMSE) was used as a measure of performance with respect to the known abundance and concordance correlation coefficient (CCC) was used to measure agreement between pairs of estimators. Results: Based on our results from simulated data, Kallisto was the most accurate estimator as it had the lowest RRMSE, followed by Freyja. Kallisto and Freyja had the most similar predictions, reflected by the highest CCC metrics. We also found that accuracy was platform and amplicon panel dependent. For example, the accuracy of Freyja was significantly higher with Illumina data compared to ONT data; performance of Kallisto was best with ARTICv4. However, when analyzing empirical data there was poor agreement among methods and variations in the number of variants detected (e.g., Freyja ARTICv4 had a mean of 2.2 variants while Kallisto ARTICv4 had a mean of 10.1 variants).
Conclusion: This work provides an understanding of the differences in performance of a number of VCEs and how accurate they are in capturing the relative abundance of SC2 variants within a mixed sample (e.g., wastewater). Such information should help officials gauge the confidence they can have in such data for informing public health decisions.
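The two performance measures named in this abstract can be sketched in a few lines. The abstract does not give the exact formulas, so this sketch rests on two stated assumptions: RRMSE is taken as the root mean squared error divided by the mean known abundance (other normalizations exist), and CCC is taken as Lin's concordance correlation coefficient computed with population (1/n) variances. The function names and the toy abundances are illustrative only and are not drawn from the paper.

```python
import math


def rrmse(estimated, known):
    """RMSE of the estimates divided by the mean known abundance (assumed form)."""
    n = len(known)
    mse = sum((e - k) ** 2 for e, k in zip(estimated, known)) / n
    return math.sqrt(mse) / (sum(known) / n)


def ccc(x, y):
    """Lin's concordance correlation coefficient between two estimators."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Penalizes both poor correlation and systematic shifts between x and y.
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Toy example: relative abundances of three variants in one simulated sample.
known = [0.5, 0.3, 0.2]
estimator_a = [0.6, 0.2, 0.2]
estimator_b = [0.55, 0.25, 0.2]

error_a = rrmse(estimator_a, known)          # accuracy against the known truth
agreement = ccc(estimator_a, estimator_b)    # agreement between two estimators
```

A lower RRMSE indicates estimates closer to the known composition, while a CCC near 1 indicates that two estimators give nearly identical predictions, which matches how the abstract uses the two measures.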

Subject(s)

COVID-19 , Humans , COVID-19/diagnosis , Likelihood Functions , SARS-CoV-2/genetics , Wastewater

10.

Use of Whole Genome Sequencing by the Federal Interagency Collaboration for Genomics for Food and Feed Safety in the United States.

Stevens, Eric L; Carleton, Heather A; Beal, Jennifer; Tillman, Glenn E; Lindsey, Rebecca L; Lauer, A C; Pightling, Arthur; Jarvis, Karen G; Ottesen, Andrea; Ramachandran, Padmini; Hintz, Leslie; Katz, Lee S; Folster, Jason P; Whichard, Jean M; Trees, Eija; Timme, Ruth E; McDermott, Patrick; Wolpert, Beverly; Bazaco, Michael; Zhao, Shaohua; Lindley, Sabina; Bruce, Beau B; Griffin, Patricia M; Brown, Eric; Allard, Marc; Tallent, Sandra; Irvin, Kari; Hoffmann, Maria; Wise, Matt; Tauxe, Robert; Gerner-Smidt, Peter; Simmons, Mustafa; Kissler, Bonnie; Defibaugh-Chavez, Stephanie; Klimke, William; Agarwala, Richa; Lindsay, James; Cook, Kimberly; Austerman, Suelee Robbe; Goldman, David; McGarry, Sherri; Hale, Kis Robertson; Dessai, Uday; Musser, Steven M; Braden, Chris.

J Food Prot ; 85(5): 755-772, 2022 05 01.

Article in English | MEDLINE | ID: mdl-35259246

ABSTRACT

ABSTRACT: This multiagency report developed by the Interagency Collaboration for Genomics for Food and Feed Safety provides an overview of the use of and transition to whole genome sequencing (WGS) technology for detection and characterization of pathogens transmitted commonly by food and for identification of their sources. We describe foodborne pathogen analysis, investigation, and harmonization efforts among the following federal agencies: National Institutes of Health; Department of Health and Human Services, Centers for Disease Control and Prevention (CDC) and U.S. Food and Drug Administration (FDA); and the U.S. Department of Agriculture, Food Safety and Inspection Service, Agricultural Research Service, and Animal and Plant Health Inspection Service. We describe single nucleotide polymorphism, core-genome, and whole genome multilocus sequence typing data analysis methods as used in the PulseNet (CDC) and GenomeTrakr (FDA) networks, underscoring the complementary nature of the results for linking genetically related foodborne pathogens during outbreak investigations while allowing flexibility to meet the specific needs of Interagency Collaboration partners. We highlight how we apply WGS to pathogen characterization (virulence and antimicrobial resistance profiles) and source attribution efforts and increase transparency by making the sequences and other data publicly available through the National Center for Biotechnology Information. We also highlight the impact of current trends in the use of culture-independent diagnostic tests for human diagnostic testing on analytical approaches related to food safety and what is next for the use of WGS in the area of food safety.

Subject(s)

Foodborne Diseases , Animals , Disease Outbreaks/prevention & control , Food Safety , Foodborne Diseases/epidemiology , Foodborne Diseases/prevention & control , Genomics , United States , Whole Genome Sequencing

11.

Multi-laboratory evaluation of the Illumina iSeq platform for whole genome sequencing of Salmonella, Escherichia coli and Listeria.

Mitchell, Patrick K; Wang, Leyi; Stanhope, Bryce J; Cronk, Brittany D; Anderson, Renee; Mohan, Shipra; Zhou, Lijuan; Sanchez, Susan; Bartlett, Paula; Maddox, Carol; DeShambo, Vanessa; Mani, Rinosh; Hengesbach, Lindsy M; Gresch, Sarah; Wright, Katie; Mor, Sunil; Zhang, Shuping; Shen, Zhenyu; Yan, Lifang; Mackey, Rebecca; Franklin-Guild, Rebecca; Zhang, Yan; Prarat, Melanie; Shiplett, Katherine; Ramachandran, Akhilesh; Narayanan, Sai; Sanders, Justin; Hunkapiller, Andree A; Lahmers, Kevin; Carbonello, Amanda A; Aulik, Nicole; Lim, Ailam; Cooper, Jennifer; Jones, Angelica; Guag, Jake; Nemser, Sarah M; Tyson, Gregory H; Timme, Ruth; Strain, Errol; Reimschuessel, Renate; Ceric, Olgica; Goodman, Laura B.

Microb Genom ; 8(2)2022 02.

Article in English | MEDLINE | ID: mdl-35113783

ABSTRACT

There is a growing need for public health and veterinary laboratories to perform whole genome sequencing (WGS) for monitoring antimicrobial resistance (AMR) and protecting the safety of people and animals. With the availability of smaller and more affordable sequencing platforms coupled with well-defined bioinformatic protocols, the technological capability to incorporate this technique for real-time surveillance and genomic epidemiology has greatly expanded. There is a need, however, to ensure that data are of high quality. The goal of this study was to assess the utility of a small benchtop sequencing platform using a multi-laboratory verification approach. Thirteen laboratories were provided the same equipment, reagents, protocols and bacterial reference strains. The Illumina DNA Prep and Nextera XT library preparation kits were compared, and 2×150 bp iSeq i100 chemistry was used for sequencing. Analyses comparing the sequences produced from this study with closed genomes from the provided strains were performed using open-source programs. A detailed, step-by-step protocol is publicly available via protocols.io (https://www.protocols.io/view/iseq-bacterial-wgs-protocol-bij8kcrw). The throughput for this method is approximately 4-6 bacterial isolates per sequencing run (20-26 Mb total load). The Illumina DNA Prep library preparation kit produced high-quality assemblies and nearly complete AMR gene annotations. The Prep method produced more consistent coverage compared to XT, and when coverage benchmarks were met, nearly all AMR, virulence and subtyping gene targets were correctly identified. Because it reduces the technical and financial barriers to generating WGS data, the iSeq platform is a viable option for small laboratories interested in genomic surveillance of microbial pathogens.

Subject(s)

Escherichia coli/genetics , Genome, Bacterial , High-Throughput Nucleotide Sequencing/methods , Listeria/genetics , Salmonella/genetics , Whole Genome Sequencing/methods , Animals , Bacteria/genetics , DNA, Bacterial/genetics , Escherichia coli Infections/microbiology , Foodborne Diseases/microbiology , Gene Library , Genomics , Laboratories , Salmonella Infections/microbiology , Virulence/genetics

12.

Future-proofing and maximizing the utility of metadata: The PHA4GE SARS-CoV-2 contextual data specification package.

Griffiths, Emma J; Timme, Ruth E; Mendes, Catarina Inês; Page, Andrew J; Alikhan, Nabil-Fareed; Fornika, Dan; Maguire, Finlay; Campos, Josefina; Park, Daniel; Olawoye, Idowu B; Oluniyi, Paul E; Anderson, Dominique; Christoffels, Alan; da Silva, Anders Gonçalves; Cameron, Rhiannon; Dooley, Damion; Katz, Lee S; Black, Allison; Karsch-Mizrachi, Ilene; Barrett, Tanya; Johnston, Anjanette; Connor, Thomas R; Nicholls, Samuel M; Witney, Adam A; Tyson, Gregory H; Tausch, Simon H; Raphenya, Amogelang R; Alcock, Brian; Aanensen, David M; Hodcroft, Emma; Hsiao, William W L; Vasconcelos, Ana Tereza R; MacCannell, Duncan R.

Gigascience ; 11, 2022 02 16.

Article in English | MEDLINE | ID: mdl-35169842

ABSTRACT

BACKGROUND: The Public Health Alliance for Genomic Epidemiology (PHA4GE) (https://pha4ge.org) is a global coalition that is actively working to establish consensus standards, document and share best practices, improve the availability of critical bioinformatics tools and resources, and advocate for greater openness, interoperability, accessibility, and reproducibility in public health microbial bioinformatics. In the face of the current pandemic, PHA4GE has identified a need for a fit-for-purpose, open-source SARS-CoV-2 contextual data standard. RESULTS: As such, we have developed a SARS-CoV-2 contextual data specification package based on harmonizable, publicly available community standards. The specification can be implemented via a collection template, as well as an array of protocols and tools to support both the harmonization and submission of sequence data and contextual information to public biorepositories. CONCLUSIONS: Well-structured, rich contextual data add value, promote reuse, and enable aggregation and integration of disparate datasets. Adoption of the proposed standard and practices will better enable interoperability between datasets and systems, improve the consistency and utility of generated data, and ultimately facilitate novel insights and discoveries in SARS-CoV-2 and COVID-19. The package is now supported by the NCBI's BioSample database.

Subject(s)

COVID-19 , SARS-CoV-2 , Genomics , Humans , Metadata , Public Health , Reproducibility of Results

13.

Interpretative Labor and the Bane of Nonstandardized Metadata in Public Health Surveillance and Food Safety.

Pettengill, James B; Beal, Jennifer; Balkey, Maria; Allard, Marc; Rand, Hugh; Timme, Ruth.

Clin Infect Dis ; 73(8): 1537-1539, 2021 10 20.

Article in English | MEDLINE | ID: mdl-34240118

ABSTRACT

Open-source DNA sequence databases have long been touted as beneficial to public health, including the facilitation of earlier detection and response to infectious disease outbreaks. Of critical importance to harnessing these benefits is the metadata that describe general and other domain-specific attributes (eg, collection location, isolate type) of a sample. Unlike the sequence data, metadata are often incomplete and lack adherence to an international standard. Here, we describe the problem posed by such variable and incomplete metadata in terms of interpretative labor costs (the time and energy necessary to make sense of the signal in the genetic data) and the impact such metadata have on foodborne outbreak detection and response. Improving the quality of sequence-associated metadata would allow for earlier detection of emerging food safety hazards and allow faster response to foodborne outbreaks.

Subject(s)

Foodborne Diseases , Metadata , Disease Outbreaks , Food Safety , Foodborne Diseases/epidemiology , Humans , Public Health , Public Health Surveillance

14.

Salmonella Genomics in Public Health and Food Safety.

Brown, Eric W; Bell, Rebecca; Zhang, Guodong; Timme, Ruth; Zheng, Jie; Hammack, Thomas S; Allard, Marc W.

EcoSal Plus ; 9(2): eESP00082020, 2021 12 15.

Article in English | MEDLINE | ID: mdl-34125583

ABSTRACT

The species Salmonella enterica comprises over 2,600 serovars, many of which are known to be intracellular pathogens of mammals, birds, and reptiles. It is now apparent that Salmonella is a highly adapted environmental microbe and can readily persist in a number of environmental niches, including water, soil, and various plant (including produce) species. Much of what is known about the evolution and diversity of nontyphoidal Salmonella serovars (NTS) in the environment is the result of the rise of the genomics era in enteric microbiology. There are over 340,000 Salmonella genomes available in public databases. This extraordinary breadth of genomic diversity now available for the species, coupled with widespread availability and affordability of whole-genome sequencing (WGS) instrumentation, has transformed the way in which we detect, differentiate, and characterize Salmonella enterica strains in a timely way. Not only have WGS data afforded a detailed and global examination of the molecular epidemiological movement of Salmonella from diverse environmental reservoirs into human and animal hosts, but they have also allowed considerable consolidation of the diagnostic effort required to test for various phenotypes important to the characterization of Salmonella. For example, drug resistance, serovar, virulence determinants, and other genome-based attributes can all be discerned using a genome sequence. Finally, genomic analysis, in conjunction with functional and phenotypic approaches, is beginning to provide new insights into the precise adaptive changes that permit persistence of NTS in so many diverse and challenging environmental niches.

Sujet(s)

Santé publique , Salmonella , Animaux , Sécurité des aliments , Génomique , Humains , Phylogenèse , Salmonella/génétique

15.

Correction for Vangay et al., "Microbiome Metadata Standards: Report of the National Microbiome Data Collaborative's Workshop and Follow-On Activities".

Vangay, Pajau; Burgin, Josephine; Johnston, Anjanette; Beck, Kristen L; Berrios, Daniel C; Blumberg, Kai; Canon, Shane; Chain, Patrick; Chandonia, John-Marc; Christianson, Danielle; Costes, Sylvain V; Damerow, Joan; Duncan, William D; Dundore-Arias, Jose Pablo; Fagnan, Kjiersten; Galazka, Jonathan M; Gibbons, Sean M; Hays, David; Hervey, Judson; Hu, Bin; Hurwitz, Bonnie L; Jaiswal, Pankaj; Joachimiak, Marcin P; Kinkel, Linda; Ladau, Joshua; Martin, Stanton L; McCue, Lee Ann; Miller, Kayd; Mouncey, Nigel; Mungall, Chris; Pafilis, Evangelos; Reddy, T B K; Richardson, Lorna; Roux, Simon; Schriml, Lynn M; Shaffer, Justin P; Sundaramurthi, Jagadish Chandrabose; Thompson, Luke R; Timme, Ruth E; Zheng, Jie; Wood-Charlson, Elisha M; Eloe-Fadrosh, Emiley A.

mSystems ; 6(3)2021 May 04.

Article de Anglais | MEDLINE | ID: mdl-33947809

16.

Erratum for Miller et al., "Phylogenetic and Biogeographic Patterns of Vibrio parahaemolyticus Strains from North America Inferred from Whole-Genome Sequence Data".

Miller, John J; Weimer, Bart C; Timme, Ruth; Lüdeke, Catharina H M; Pettengill, James B; Bandoy, DJ Darwin; Weis, Allison M; Kaufman, James; Huang, B Carol; Payne, Justin; Strain, Errol; Jones, Jessica L.

Appl Environ Microbiol ; 87(12): e0069321, 2021 May 26.

Article de Anglais | MEDLINE | ID: mdl-34037462

17.

Phylogeny of Salmonella enterica subspecies arizonae by whole-genome sequencing reveals high incidence of polyphyly and low phase 1 H antigen variability.

Shariat, Nikki W; Timme, Ruth E; Walters, Abigail T.

Microb Genom ; 7(2)2021 02.

Article de Anglais | MEDLINE | ID: mdl-33539276

RÉSUMÉ

Salmonella enterica subspecies arizonae is frequently associated with animal reservoirs, particularly reptiles, and can cause illness in some mammals, including humans. Using whole-genome sequencing data, core genome phylogenetic analyses were performed using 112 S. enterica subsp. arizonae isolates, representing 46 of 102 described serovars. Nearly one-third of these are polyphyletic, including two serovars that appear in four and five distinct evolutionary lineages. Subspecies arizonae has a monophasic H antigen. Among the 46 serovars investigated, only 8 phase 1 H antigens were identified, demonstrating high conservation for this antigen. Prophages and plasmids were found throughout this subspecies, including five novel prophages. Polyphyly was also reflected in prophage content, although some clade-specific enrichment for some phages was observed. IncFII(S) was the most frequent plasmid replicon identified and was found in a quarter of S. enterica subsp. arizonae genomes. Salmonella pathogenicity islands (SPIs) 1 and 2 are present across all Salmonella, including this subspecies, although effectors sipA, sptP and arvA in SPI-1 and sseG and ssaI in SPI-2 appear to be lost in this lineage. SPI-20, encoding a type VI secretion system, is exclusive to this subspecies and is well maintained in all genomes sampled. A number of fimbrial operons were identified, including the sas operon that appears to be a synapomorphy for this subspecies, while others exhibited more clade-specific patterns. This work reveals evolutionary patterns in S. enterica subsp. arizonae that make this subspecies a unique lineage within this very diverse species.

Sujet(s)

Antigènes bactériens/génétique , Salmonella enterica/classification , Séquençage du génome entier/méthodes , Antigènes bactériens/immunologie , Fimbriae bactériens/génétique , Génome bactérien , Ilots génomiques , Séquençage nucléotidique à haut débit , Phylogenèse , Plasmides/génétique , Prophages/génétique , Salmonella enterica/génétique , Salmonella enterica/immunologie , Sérogroupe

18.

Microbiome Metadata Standards: Report of the National Microbiome Data Collaborative's Workshop and Follow-On Activities.

Vangay, Pajau; Burgin, Josephine; Johnston, Anjanette; Beck, Kristen L; Berrios, Daniel C; Blumberg, Kai; Canon, Shane; Chain, Patrick; Chandonia, John-Marc; Christianson, Danielle; Costes, Sylvain V; Damerow, Joan; Duncan, William D; Dundore-Arias, Jose Pablo; Fagnan, Kjiersten; Galazka, Jonathan M; Gibbons, Sean M; Hays, David; Hervey, Judson; Hu, Bin; Hurwitz, Bonnie L; Jaiswal, Pankaj; Joachimiak, Marcin P; Kinkel, Linda; Ladau, Joshua; Martin, Stanton L; McCue, Lee Ann; Miller, Kayd; Mouncey, Nigel; Mungall, Chris; Pafilis, Evangelos; Reddy, T B K; Richardson, Lorna; Roux, Simon; Schriml, Lynn M; Shaffer, Justin P; Sundaramurthi, Jagadish Chandrabose; Thompson, Luke R; Timme, Ruth E; Zheng, Jie; Wood-Charlson, Elisha M; Eloe-Fadrosh, Emiley A.

mSystems ; 6(1)2021 02 23.

Article de Anglais | MEDLINE | ID: mdl-33622857

RÉSUMÉ

Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical. Important contributions have been made in the development of community-driven metadata standards; however, these standards have not been uniformly embraced by the microbiome research community. To understand how these standards are being adopted, or the barriers to adoption, across research domains, institutions, and funding agencies, the National Microbiome Data Collaborative (NMDC) hosted a workshop in October 2019. This report provides a summary of discussions that took place throughout the workshop, as well as outcomes of the working groups initiated at the workshop.

19.

Investigating the Distribution of Strains of Erwinia amylovora and Streptomycin Resistance in Apple Orchards in New York Using Clustered Regularly Interspaced Short Palindromic Repeat Profiles: A 6-Year Follow-Up.

Wallis, Anna; Yannuzzi, Isabella M; Choi, Mei-Wah; Spafford, John; Fenn, Matthew; Ramachandran, Padmini; Timme, Ruth; Pettengill, James B; Cagle, Robin; Ottesen, Andrea; Cox, Kerik D.

Plant Dis ; 105(11): 3554-3563, 2021 Nov.

Article de Anglais | MEDLINE | ID: mdl-33599513

RÉSUMÉ

Fire blight, caused by the bacterium Erwinia amylovora, is one of the most important diseases of apple. The antibiotic streptomycin is routinely used in the commercial apple industries of New York (NY) and New England to manage the disease. In 2002 and again from 2011 to 2014, outbreaks of streptomycin resistance (SmR) were reported and investigated in NY. Motivated by new grower reports of control failures, we conducted a follow-up investigation of the distribution of SmR and E. amylovora strains for major apple production regions of NY over the last 6 years (2015 to 2020). Characterization of clustered regularly interspaced short palindromic repeat (CRISPR) profiles revealed that a few "cosmopolitan" strains were widely prevalent across regions, whereas many other "resident" strains were confined to one location. In addition, we uncovered novel CRISPR profile diversity in all investigated regions. SmR E. amylovora was detected only in a small area spanning two counties from 2017 to 2020 and was always associated with one CRISPR profile (41:23:38), which matched the profile of SmR E. amylovora discovered in 2002. This suggests the original SmR E. amylovora was never fully eradicated and went undetected because of several seasons of low disease pressure in this region. Investigation of several representative isolates under controlled greenhouse conditions indicated significant differences in aggressiveness on 'Gala' apples. Potential implications of strain differences include the propensity of strains to become distributed across wide geographic regions and associated resistance management practices. Results from this work will directly influence sustainable fire blight management recommendations for commercial apple industries in NY state and other regions.

Sujet(s)

Erwinia amylovora , Malus , Clustered regularly interspaced short palindromic repeats , Erwinia amylovora/génétique , Études de suivi , Malus/génétique , État de New York , Maladies des plantes , Streptomycine/pharmacologie

20.

Phylogenetic and Biogeographic Patterns of Vibrio parahaemolyticus Strains from North America Inferred from Whole-Genome Sequence Data.

Miller, John J; Weimer, Bart C; Timme, Ruth; Lüdeke, Catharina H M; Pettengill, James B; Bandoy, DJ Darwin; Weis, Allison M; Kaufman, James; Huang, B Carol; Payne, Justin; Strain, Errol; Jones, Jessica L.

Appl Environ Microbiol ; 87(3)2021 01 15.

Article de Anglais | MEDLINE | ID: mdl-33187991

RÉSUMÉ

Vibrio parahaemolyticus is the most common cause of seafood-borne illness reported in the United States. The draft genomes of 132 North American clinical and oyster V. parahaemolyticus isolates were sequenced to investigate their phylogenetic and biogeographic relationships. The majority of oyster isolate sequence types (STs) were from a single harvest location; however, four were identified from multiple locations. There was population structure along the Gulf and Atlantic Coasts of North America, with what seemed to be a hub of genetic variability along the Gulf Coast, with some of the same STs occurring along the Atlantic Coast and one shared between the coastal waters of the Gulf and those of Washington State. Phylogenetic analyses found nine well-supported clades. Two clades were composed of isolates from both clinical and oyster sources. Four were composed of isolates entirely from clinical sources, and three were entirely from oyster sources. Each single-source clade consisted of one ST. Some human isolates lack tdh, trh, and some type III secretion system (T3SS) genes, which are established virulence genes of V. parahaemolyticus. Thus, these genes are not essential for pathogenicity. However, isolates in the monophyletic groups from clinical sources were enriched in several categories of genes compared to those from monophyletic groups of oyster isolates. These functional categories include cell signaling, transport, and metabolism. The identification of genes in these functional categories provides a basis for future in-depth pathogenicity investigations of V. parahaemolyticus.

IMPORTANCE: Vibrio parahaemolyticus is the most common cause of seafood-borne illness reported in the United States and is frequently associated with shellfish consumption. This study contributes to our knowledge of the biogeography and functional genomics of this species around North America. STs shared between the Gulf Coast and the Atlantic seaboard as well as Pacific waters suggest possible transport via oceanic currents or large shipping vessels. STs frequently isolated from humans but rarely, if ever, isolated from the environment are likely more competitive in the human gut than other STs. This could be due to additional functional capabilities in areas such as cell signaling, transport, and metabolism, which may give these isolates an advantage in novel nutrient-replete environments such as the human gut.

Sujet(s)

Vibrio parahaemolyticus/génétique , Animaux , Surveillance biologique , Gènes bactériens , Génome bactérien , Humains , Amérique du Nord , Ostreidae/microbiologie , Phylogenèse , Infections à Vibrio/microbiologie , Vibrio parahaemolyticus/isolement et purification , Virulence/génétique , Séquençage du génome entier
