minMLST: machine learning for optimization of bacterial strain typing.
Cohen, Shani; Rokach, Lior; Motro, Yair; Moran-Gilad, Jacob; Veksler-Lublinsky, Isana.
Affiliation
  • Cohen S; Department of Software and Information Systems Engineering, Ben Gurion University of the Negev, Beer Sheva 8410501, Israel.
  • Rokach L; Department of Software and Information Systems Engineering, Ben Gurion University of the Negev, Beer Sheva 8410501, Israel.
  • Motro Y; Department of Health Systems Management, Ben Gurion University of the Negev, Beer Sheva 8410501, Israel.
  • Moran-Gilad J; Department of Health Systems Management, Ben Gurion University of the Negev, Beer Sheva 8410501, Israel.
  • Veksler-Lublinsky I; Department of Software and Information Systems Engineering, Ben Gurion University of the Negev, Beer Sheva 8410501, Israel.
Bioinformatics; 37(3): 303-311, 2021 04 20.
Article in English | MEDLINE | ID: mdl-32804993
MOTIVATION: High-resolution microbial strain typing is essential for various clinical purposes, including disease outbreak investigation, tracking of microbial transmission events and epidemiological surveillance of bacterial infections. The widely used approach for multilocus sequence typing (MLST) that is based on the core genome, cgMLST, has the advantage of a high level of typeability and maximal discriminatory power. Yet, the transition from a seven loci-based scheme to cgMLST involves several challenges, which include the need by some users to maintain backward compatibility, growing difficulties in day-to-day communication within the microbiology community with respect to nomenclature and ontology, issues with typeability, especially if a more stringent approach to loci presence is used, and computational requirements concerning laboratory data management and sharing with end-users. Hence, methods for optimizing cgMLST schemes through careful reduction of the number of loci are expected to be beneficial for practical needs in different settings.
RESULTS: We present a new machine learning-based methodology, minMLST, for minimizing the number of genes in cgMLST schemes by identifying subsets of informative genes and analyzing the trade-off between gene reduction and typing performance. The results achieved with minMLST over eight bacterial species show that despite a reduction in the number of genes by up to a factor of 10, the typing performance remains very high and significant, with an Adjusted Rand Index that ranges between 0.4 and 0.93 in different species and a P-value < 10⁻³. The identification of such optimized MLST schemes for bacterial strain typing is expected to improve the implementation of cgMLST by improving interlaboratory agreement and communication.
AVAILABILITY AND IMPLEMENTATION: The Python package minMLST is available at https://PyPi.org/project/minmlst/PyPI and supported on Linux and Windows.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
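The abstract scores typing performance with the Adjusted Rand Index (ARI), which measures how closely two clusterings of the same isolates agree after correcting for chance. The following is a minimal, standard-library sketch of the standard ARI formula; it is not taken from the minMLST package, and the function name `adjusted_rand_index` and the two example label lists are hypothetical illustrations only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two flat clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    # Contingency counts: items placed in cluster i by A and cluster j by B.
    pair_counts = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    total_pairs = comb(n, 2)
    # Number of co-clustered pairs expected under random labelling.
    expected = sum_a * sum_b / total_pairs if total_pairs else 0.0
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case (e.g. both partitions trivial): treat as agreement.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical cluster assignments for six isolates (illustrative data only).
full_scheme = [0, 0, 1, 1, 2, 2]     # clusters from a full cgMLST scheme
reduced_scheme = [0, 0, 1, 1, 1, 2]  # clusters from a reduced gene subset

ari = adjusted_rand_index(full_scheme, reduced_scheme)
print(f"ARI = {ari:.2f}")
```

With these made-up labels a single isolate changes cluster and the ARI is about 0.44; identical partitions give 1.0, regardless of how the cluster labels are named.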
Subjects

Full text: 1 | Database: MEDLINE | Main subject: Disease Outbreaks / Genome, Bacterial | Study type: Prognostic_studies | Language: En | Publication year: 2021 | Document type: Article
