Framework for Testing Robustness of Machine Learning-Based Classifiers.

Chuah, Joshua; Kruger, Uwe; Wang, Ge; Yan, Pingkun; Hahn, Juergen

Affiliation

Chuah J; Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
Kruger U; Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
Wang G; Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
Yan P; Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
Hahn J; Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.

J Pers Med; 12(8), 2022 Aug 14.

Article in English | MEDLINE | ID: mdl-36013263

ABSTRACT

There has been a rapid increase in the number of artificial intelligence (AI)/machine learning (ML)-based biomarker diagnostic classifiers in recent years. However, relatively little work has focused on assessing the robustness of these biomarkers, i.e., investigating the uncertainty of the AI/ML models that these biomarkers are based upon. This paper addresses this issue by proposing a framework to evaluate the already-developed classifiers with regard to their robustness by focusing on the variability of the classifiers' performance and changes in the classifiers' parameter values using factor analysis and Monte Carlo simulations. Specifically, this work evaluates (1) the importance of a classifier's input features and (2) the variability of a classifier's output and model parameter values in response to data perturbations. Additionally, it was found that one can estimate a priori how much replacement noise a classifier can tolerate while still meeting accuracy goals. To illustrate the evaluation framework, six different AI/ML-based biomarkers are developed using commonly used techniques (linear discriminant analysis, support vector machines, random forest, partial least squares discriminant analysis, logistic regression, and multilayer perceptron) for a metabolomics dataset involving 24 measured metabolites taken from 159 study participants. The framework was able to correctly predict which of the classifiers should be less robust than others without recomputing the classifiers themselves, and this prediction was then validated in a detailed analysis.
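The Monte Carlo perturbation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier stands in for the six techniques listed, the data are synthetic, and "replacement noise" is assumed here to mean substituting a random fraction of feature values with draws from a normal distribution matched to each feature's mean and standard deviation. All function names are hypothetical.

```python
import random
import statistics


def make_data(n_per_class=80, n_features=5, shift=2.0, seed=0):
    """Synthetic two-class data (an assumption; not the metabolomics dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Stand-in classifier: one mean vector per class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)


def replace_with_noise(X, fraction, rng):
    """Replace each entry with probability `fraction` by a draw from
    N(feature mean, feature std). This noise model is an assumption."""
    cols = list(zip(*X))
    mu = [statistics.fmean(c) for c in cols]
    sd = [statistics.pstdev(c) for c in cols]
    return [
        [rng.gauss(mu[j], sd[j]) if rng.random() < fraction else v
         for j, v in enumerate(row)]
        for row in X
    ]


def monte_carlo_accuracy(centroids, X, y, fraction, n_trials=200, seed=1):
    """Mean and spread of accuracy of a fixed classifier under perturbed inputs."""
    rng = random.Random(seed)
    accs = [accuracy(centroids, replace_with_noise(X, fraction, rng), y)
            for _ in range(n_trials)]
    return statistics.fmean(accs), statistics.pstdev(accs)


X, y = make_data()
model = fit_centroids(X, y)          # trained once, never refitted
baseline = accuracy(model, X, y)
results = {f: monte_carlo_accuracy(model, X, y, f) for f in (0.0, 0.1, 0.3, 0.5)}
for f, (mean_acc, std_acc) in results.items():
    print(f"noise fraction {f:.1f}: accuracy {mean_acc:.3f} +/- {std_acc:.3f}")
```

Reading off the largest noise fraction at which the mean accuracy still exceeds a chosen target gives a rough, empirical counterpart to the tolerance estimate mentioned in the abstract; the classifier is kept fixed throughout, in line with the framework's focus on already-developed classifiers.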

Keywords

algorithms; artificial intelligence; biomarker; classification; machine learning; omics analysis

Full text: 1 | Collections: 01-internacional | Database: MEDLINE | Study type: Prognostic_studies | Language: English | Journal: J Pers Med | Publication year: 2022 | Document type: Article