Exploring the predictive capability of advanced machine learning in identifying severe disease phenotype in Salmonella enterica.
Karanth, Shraddha; Tanui, Collins K; Meng, Jianghong; Pradhan, Abani K.
Affiliation
  • Karanth S; Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA.
  • Tanui CK; Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA.
  • Meng J; Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA; Joint Institute for Food Safety and Applied Nutrition, University of Maryland, College Park, MD 20742, USA.
  • Pradhan AK; Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA. Electronic address: akp@umd.edu.
Food Res Int; 151: 110817, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34980422
The past few years have seen a significant increase in the availability of whole genome sequencing (WGS) data, allowing for its incorporation in predictive modeling for foodborne pathogens to account for inter- and intra-species differences in their virulence. However, this is hindered by the inability of traditional statistical methods to analyze data in which the number of features far exceeds the number of observations/isolates. In this study, we explored the applicability of machine learning (ML) models to predict disease outcome while identifying features that exert a significant effect on the prediction. The study was conducted on Salmonella enterica, a major foodborne pathogen with considerable inter- and intra-serovar variation. WGS data from isolates obtained from various sources (i.e., human, chicken, and swine) were used as input to four ML models (logistic regression with ridge regularization, random forest, support vector machine, and AdaBoost) to classify isolates based on disease severity (extraintestinal vs. gastrointestinal) in the host. The predictive performance of all models was tested with and without Elastic Net regularization to combat dimensionality issues. The Elastic Net-regularized logistic regression model showed the best area under the receiver operating characteristic curve (AUC-ROC; 0.86) and outcome prediction accuracy (0.76). Additionally, genes coding for transcriptional regulation; acidic, oxidative, and anaerobic stress response; and antibiotic resistance were found to be significant predictors of disease severity. These genes, which were significantly associated with each outcome, could potentially serve as inputs to amended, gene-expression-specific predictive models to estimate the virulence pattern-specific effects of Salmonella and other foodborne pathogens on human health.
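As a reading aid, the quantities named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the `alpha` / `l1_ratio` parameterization of the Elastic Net penalty (one common convention), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example and are not data from the study.

```python
def elastic_net_penalty(weights, alpha, l1_ratio):
    """Elastic Net penalty under one common parameterization (an assumption):
    alpha * (l1_ratio * sum|w| + (1 - l1_ratio) / 2 * sum w^2).
    l1_ratio = 1 gives a pure lasso penalty, l1_ratio = 0 a pure ridge penalty."""
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return alpha * (l1_ratio * l1 + (1.0 - l1_ratio) / 2.0 * l2)


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive isolate
    receives a higher score than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of isolates classified correctly at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 1 = extraintestinal, 0 = gastrointestinal.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.5]

print(elastic_net_penalty([1.0, -2.0, 0.0], alpha=1.0, l1_ratio=0.5))  # 2.75
print(auc_roc(labels, scores))   # 0.875
print(accuracy(labels, scores))  # 0.625
```

The toy example also shows why the two metrics can differ, as they do in the abstract: AUC-ROC depends only on how the scores rank the isolates, whereas accuracy depends on the chosen threshold.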

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Salmonella enterica | Type of study: Prognostic_studies / Risk_factors_studies | Limits: Animals | Language: En | Journal: Food Res Int | Publication year: 2022 | Document type: Article | Affiliation country: United States
