Exploring the predictive capability of advanced machine learning in identifying severe disease phenotype in Salmonella enterica.
Food Res Int; 151: 110817, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34980422
The past few years have seen a significant increase in the availability of whole genome sequencing (WGS) data, allowing its incorporation into predictive modeling for foodborne pathogens to account for inter- and intra-species differences in virulence. However, this is hindered by the inability of traditional statistical methods to analyze such large amounts of data relative to the number of observations (isolates). In this study, we explored the applicability of machine learning (ML) models to predict disease outcome while identifying the features that exert a significant effect on the prediction. The study was conducted on Salmonella enterica, a major foodborne pathogen with considerable inter- and intra-serovar variation. WGS data from isolates obtained from various sources (i.e., human, chicken, and swine) were used as input to four ML models (logistic regression with ridge, random forest, support vector machine, and AdaBoost) to classify isolates by disease severity (extraintestinal vs. gastrointestinal) in the host. The predictive performance of all models was tested with and without Elastic Net regularization to combat dimensionality issues. The Elastic Net-regularized logistic regression model showed the best area under the receiver operating characteristic curve (AUC-ROC; 0.86) and outcome prediction accuracy (0.76). Additionally, genes coding for transcriptional regulation; acidic, oxidative, and anaerobic stress response; and antibiotic resistance were found to be significant predictors of disease severity. These genes, which were significantly associated with each outcome, could potentially serve as inputs to amended, gene-expression-specific predictive models to estimate the virulence pattern-specific effect of Salmonella and other foodborne pathogens on human health.
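As a rough illustration of the kind of workflow the abstract describes, the sketch below fits an Elastic Net-regularized logistic regression to simulated gene presence/absence data and reports a held-out AUC-ROC together with the genes that keep a non-zero weight. It is not the authors' code or data: the abstract does not state the software, feature encoding, or hyperparameters used, so the simulated matrix, the effect sizes, and every name and setting here (`fit_elastic_net_logistic`, `alpha`, `l1_ratio`, `lr`, `epochs`) are assumptions chosen only for illustration. The fit uses plain proximal gradient descent so that the example needs nothing beyond the Python standard library.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_ISOLATES, N_GENES = 300, 40
# Hypothetical effect sizes: only the first five genes influence the outcome.
TRUE_EFFECTS = [2.0, -2.0, 1.5, -1.5, 2.0] + [0.0] * (N_GENES - 5)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n):
    """Simulated gene presence/absence rows with a binary severity label."""
    X, y = [], []
    for _ in range(n):
        row = [random.randint(0, 1) for _ in range(N_GENES)]
        logit = sum(e * x for e, x in zip(TRUE_EFFECTS, row)) - 1.0
        X.append(row)
        y.append(1 if random.random() < sigmoid(logit) else 0)
    return X, y


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_logistic(X, y, alpha=0.05, l1_ratio=0.5, lr=0.3, epochs=300):
    """Proximal gradient descent on log-loss with an Elastic Net penalty."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err
            for j, xj in enumerate(row):
                if xj:
                    grad_w[j] += err * xj
        b -= lr * grad_b / n  # the intercept is left unpenalized
        for j in range(p):
            # Gradient step on log-loss plus the L2 part of the penalty,
            # followed by soft-thresholding for the L1 part.
            step = w[j] - lr * (grad_w[j] / n + alpha * (1 - l1_ratio) * w[j])
            w[j] = soft_threshold(step, lr * alpha * l1_ratio)
    return w, b


def score(X, w, b):
    """Predicted probability of the positive class for each row."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]


def roc_auc(y_true, scores):
    """AUC-ROC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


X, y = simulate(N_ISOLATES)
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

w, b = fit_elastic_net_logistic(X_train, y_train)
auc = roc_auc(y_test, score(X_test, w, b))
selected = [j for j, wj in enumerate(w) if wj != 0.0]

print(f"held-out AUC-ROC: {auc:.2f}")
print(f"genes with non-zero weight: {selected}")
```

The L1 part of the penalty sets many coefficients exactly to zero, which is one plausible way a model of this kind can both cope with more features than isolates and point to a short list of candidate predictor genes. In practice a library implementation with cross-validated hyperparameters would be the more usual choice.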
Full text: 1
Collections: 01-internacional
Database: MEDLINE
Main subject: Salmonella enterica
Study type: Prognostic_studies / Risk_factors_studies
Limits: Animals
Language: En
Journal: Food Res Int
Publication year: 2022
Document type: Article
Affiliation country: United States