
Comprehensive assessment of the quality of Salmonella whole genome sequence data available in public sequence databases using the Salmonella in silico Typing Resource (SISTR).

Robertson, James; Yoshida, Catherine; Kruczkiewicz, Peter; Nadon, Celine; Nichani, Anil; Taboada, Eduardo N; Nash, John Howard Eagles.

Microb Genom; 4(2), 2018 Feb.

Article in English | MEDLINE | ID: mdl-29338812

ABSTRACT

Public health and food safety institutions around the world are adopting whole genome sequencing (WGS) to replace conventional methods for characterizing Salmonella for use in surveillance and outbreak response. Falling costs and increased throughput of WGS have resulted in an explosion of data, but questions remain as to the reliability and robustness of the data. Due to the critical importance of serovar information to public health, it is essential to have reliable serovar assignments available for all of the Salmonella records. The current study used a systematic assessment and curation of all Salmonella in the sequence read archive (SRA) to assess the state of the data and their utility. A total of 67,758 genomes were assembled de novo and quality-assessed for their assembly metrics as well as species and serovar assignments. A total of 42,400 genomes passed all of the quality criteria but 30.16% of genomes were deposited without serotype information. These data were used to compare the concordance of reported and predicted serovars for two in silico prediction tools, multi-locus sequence typing (MLST) and the Salmonella in silico Typing Resource (SISTR), which produced predictions that were fully concordant with 87.51% and 91.91% of the tested isolates, respectively. Concordance of in silico predictions increased when serovar variants were grouped together, 89.25% for MLST and 94.98% for SISTR. This study represents the first large-scale validation of serovar information in public genomes and provides a large validated set of genomes, which can be used to benchmark new bioinformatics tools.
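The concordance figures in the abstract compare each record's reported serovar with the serovar predicted in silico, first as an exact match and then after grouping serovar variants. The Python sketch below illustrates that kind of calculation only; it is not the authors' pipeline. The function name `concordance`, the toy records, and the variant grouping are all hypothetical, chosen for illustration, and none of the values come from the study.

```python
from typing import Iterable, Mapping, Optional, Tuple


def concordance(
    pairs: Iterable[Tuple[str, str]],
    variant_groups: Optional[Mapping[str, str]] = None,
) -> float:
    """Return the percentage of (reported, predicted) serovar pairs that agree.

    variant_groups optionally maps a lower-cased serovar name to the
    lower-cased name of the group it should be counted under.
    """
    groups = variant_groups or {}

    def canon(name: str) -> str:
        # Normalize case and whitespace, then apply the optional grouping.
        key = name.strip().lower()
        return groups.get(key, key)

    total = 0
    matches = 0
    for reported, predicted in pairs:
        total += 1
        if canon(reported) == canon(predicted):
            matches += 1
    if total == 0:
        raise ValueError("no records to compare")
    return 100.0 * matches / total


# Toy records invented purely for illustration (not data from the study).
records = [
    ("Enteritidis", "Enteritidis"),
    ("Typhimurium", "I 4,[5],12:i:-"),
    ("Heidelberg", "Heidelberg"),
    ("Newport", "Kentucky"),
]

# Hypothetical grouping: count the monophasic variant under Typhimurium.
groups = {"i 4,[5],12:i:-": "typhimurium"}

strict = concordance(records)           # exact-match concordance
grouped = concordance(records, groups)  # concordance after grouping variants
```

With these toy records, grouping raises the concordance because the Typhimurium variant is no longer counted as a mismatch, which mirrors the direction of the change reported in the abstract.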

Subject(s)

Bacterial Typing Techniques/methods; Databases, Nucleic Acid; Salmonella/genetics; Whole Genome Sequencing/methods; Computer Simulation; DNA, Bacterial/genetics; Genome, Bacterial; Multilocus Sequence Typing; Public Health; Reproducibility of Results; Salmonella/classification; Salmonella Infections/microbiology; Salmonella enterica; Sequence Analysis; Serogroup; Serotyping
