Comparing genome versus proteome-based identification of clinical bacterial isolates.
Brief Bioinform; 19(3): 495-505, 2018 05 01.
Article in En | MEDLINE | ID: mdl-28013236
Whole-genome sequencing (WGS) is gaining importance in the analysis of bacterial cultures derived from patients with infectious diseases. Existing computational tools for WGS-based identification have, however, been evaluated on previously defined data and thus rely uncritically on the available taxonomic information. Here, we newly sequenced 846 clinical gram-negative bacterial isolates representing multiple distinct genera and compared the performance of five tools (CLARK, Kaiju, Kraken, DIAMOND/MEGAN and TUIT). To establish a faithful 'gold standard', the expert-driven taxonomy was compared with identifications based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) analysis. Additionally, the tools were evaluated using a data set of 200 Staphylococcus aureus isolates. CLARK and Kraken (with k = 31) performed best with 626 (100%) and 193 (99.5%) correct species classifications for the gram-negative and S. aureus isolates, respectively. Moreover, CLARK and Kraken achieved the highest mean F-measure values (85.5/87.9% and 94.4/94.7% for the two data sets, respectively) in comparison with DIAMOND/MEGAN (71 and 85.3%), Kaiju (41.8 and 18.9%) and TUIT (34.5 and 86.5%). Finally, CLARK, Kaiju and Kraken outperformed the other tools by a factor of 30 to 170 in terms of runtime. We conclude that the application of nucleotide-based tools using k-mers (e.g. CLARK or Kraken) allows for accurate and fast taxonomic characterization of bacterial isolates from WGS data. Hence, our results suggest WGS-based genotyping to be a promising alternative to MS-based biotyping in clinical settings. Moreover, we suggest that complementary information should be used for the evaluation of taxonomic classification tools, as public databases may suffer from suboptimal annotations.
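The abstract reports mean F-measure values but does not say how they were averaged. The sketch below is one plausible reading, not the authors' procedure: it computes a per-species F-measure (the harmonic mean of precision and recall) from paired gold-standard and predicted labels, then takes the unweighted mean across species. The function names, the use of `None` for an unclassified isolate, and the example labels are all assumptions made for illustration.

```python
from collections import Counter


def per_species_f_measure(truth, predicted):
    """Return {species: F-measure} from paired gold-standard and predicted labels.

    Assumption: a prediction of None means the tool left the isolate
    unclassified; it counts as a miss for the true species and is not
    treated as a species of its own.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(truth, predicted):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1          # the true species was missed
            if p is not None:
                fp[p] += 1      # a wrong species was reported
    labels = set(truth) | {p for p in predicted if p is not None}
    scores = {}
    for label in labels:
        # F = 2TP / (2TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def mean_f_measure(truth, predicted):
    """Unweighted (macro) mean of the per-species F-measures."""
    scores = per_species_f_measure(truth, predicted)
    return sum(scores.values()) / len(scores)


# Hypothetical labels for four isolates; not data from the study.
gold = ["E. coli", "E. coli", "K. pneumoniae", "K. pneumoniae"]
calls = ["E. coli", "K. pneumoniae", "K. pneumoniae", None]
print(per_species_f_measure(gold, calls))
print(mean_f_measure(gold, calls))
```

A macro average weights rare and common species equally, so a tool that fails on a few sparsely represented genera is penalized more than it would be under a per-isolate accuracy figure.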
Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Bacterial Proteins / Bacterial Genome / Proteome / Whole Genome Sequencing / Gram-Negative Bacteria
Type of study: Diagnostic_studies / Prognostic_studies
Limits: Humans
Language: En
Journal: Brief Bioinform
Publication year: 2018
Document type: Article