Error Consistency for Machine Learning Evaluation and Validation with Application to Biomedical Diagnostics.

Levman, Jacob; Ewenson, Bryan; Apaloo, Joe; Berger, Derek; Tyrrell, Pascal N

Affiliation

Levman J; Department of Computer Science, St. Francis Xavier University, Antigonish, NS B2G 2W5, Canada.
Ewenson B; Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Department of Radiology, Harvard Medical School, Boston, MA 02129, USA.
Apaloo J; Nova Scotia Health Authority, Halifax, NS B3H 1V7, Canada.
Berger D; Department of Computer Science, St. Francis Xavier University, Antigonish, NS B2G 2W5, Canada.
Tyrrell PN; Department of Mathematics and Statistics, St. Francis Xavier University, Antigonish, NS B2G 2W5, Canada.

Diagnostics (Basel); 13(7), 2023 Apr 01.

Article in English | MEDLINE | ID: mdl-37046533

ABSTRACT

Supervised machine learning classification is the most common example of artificial intelligence (AI) in industry and in academic research. These technologies predict whether a series of measurements belongs to one of multiple groups of examples on which the machine was previously trained. Prior to real-world deployment, all implementations need to be carefully evaluated with hold-out validation, where the algorithm is tested on samples different from those provided for training, in order to ensure the generalizability and reliability of AI models. However, established methods for performing hold-out validation do not assess the consistency of the mistakes that the AI model makes during hold-out validation. Here, we show that in addition to standard methods, an enhanced technique for performing hold-out validation, one that also assesses the consistency of the sample-wise mistakes made by the learning algorithm, can assist in the evaluation and design of reliable and predictable AI models. The technique can be applied to the validation of any supervised learning classification application, and we demonstrate its use on a variety of example biomedical diagnostic applications, which help illustrate the importance of producing reliable AI models. The validation software created is made publicly available, assisting anyone developing AI models for any supervised classification application in the creation of more reliable and predictable technologies.
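
The abstract does not say how the consistency of sample-wise mistakes is quantified. As a minimal sketch only, assuming consistency is measured as the overlap (intersection over union) between the sets of misclassified samples from repeated hold-out runs, the idea could be illustrated as follows. The function names, the overlap measure, and the sample IDs are all hypothetical and are not taken from the authors' published software.

```python
from itertools import combinations


def error_consistency(errors_a, errors_b):
    """Overlap (intersection over union) of two sets of misclassified sample IDs.

    Assumption: this overlap measure is illustrative; the abstract does not
    specify the formula used by the authors.
    """
    a, b = set(errors_a), set(errors_b)
    union = a | b
    if not union:
        # Neither run made a mistake; treated here as fully consistent
        # (a convention chosen for this sketch).
        return 1.0
    return len(a & b) / len(union)


def mean_error_consistency(error_sets):
    """Average pairwise overlap across repeated validation runs."""
    pairs = list(combinations(error_sets, 2))
    if not pairs:
        raise ValueError("need at least two validation runs")
    return sum(error_consistency(a, b) for a, b in pairs) / len(pairs)


# Hypothetical misclassified sample IDs from three repeated hold-out runs
# of the same classifier, each trained on a different training split.
runs = [{3, 7, 12}, {3, 7, 15}, {3, 12, 15}]
score = mean_error_consistency(runs)
```

Under this assumed measure, a score near 1 would mean the model tends to misclassify the same samples regardless of the training split, while a score near 0 would mean its mistakes shift from run to run even if overall accuracy stays the same.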

Keywords

classification; error consistency; supervised machine learning; validation

Full text: 1 | Collections: 01-international | Database: MEDLINE | Study type: Diagnostic_studies / Prognostic_studies | Language: En | Journal: Diagnostics (Basel) | Publication year: 2023 | Document type: Article | Affiliation country: Canada