Use of Response Permutation to Measure an Imaging Dataset's Susceptibility to Overfitting by Selected Standard Analysis Pipelines.
Chakraborty, Jayasree; Midya, Abhishek; Kurland, Brenda F; Welch, Mattea L; Gonen, Mithat; Moskowitz, Chaya S; Simpson, Amber L.
Affiliation
  • Chakraborty J; Department of Surgery, Hepatopancreatobiliary Service, Memorial Sloan Kettering Cancer Center, New York, New York, USA.
  • Midya A; Department of Surgery, Hepatopancreatobiliary Service, Memorial Sloan Kettering Cancer Center, New York, New York, USA.
  • Kurland BF; GSK, Collegeville, Pennsylvania, USA.
  • Welch ML; Princess Margaret Data Science Program, University Health Network, Toronto, Ontario, Canada.
  • Gonen M; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York, USA.
  • Moskowitz CS; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York, USA.
  • Simpson AL; School of Computing, Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada. Electronic address: amber.simpson@queensu.ca.
Acad Radiol; 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38614825
ABSTRACT
RATIONALE AND OBJECTIVES:

This study demonstrates a method for quantifying the impact of overfitting on the area under the receiver operating characteristic curve (AUC) when standard analysis pipelines are used to develop imaging biomarkers. We illustrate the approach using two publicly available repositories of radiology and pathology images for breast cancer diagnosis.

MATERIALS AND METHODS:

For each dataset, we permuted the outcome (cancer diagnosis) values to eliminate any true association between imaging features and outcome. Seven types of classification models (logistic regression, linear discriminant analysis, naïve Bayes, linear support vector machine, nonlinear support vector machine, random forest, and multilayer perceptron) were fitted to each scrambled dataset and evaluated by each of four techniques (all data, hold-out, 10-fold cross-validation, and bootstrapping). After repeating this process for a total of 50 outcome permutations, we averaged the resulting AUCs. Any increase over the null AUC of 0.5 can be attributed to overfitting.
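A minimal Python sketch of this permutation procedure, assuming scikit-learn, with a single classifier/evaluator pair (logistic regression with 10-fold cross-validation) standing in for the paper's full grid of seven models and four evaluation techniques; the feature matrix X and labels y below are synthetic stand-ins, not the study data.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def mean_permuted_auc(X, y, n_permutations=50):
    # Permuting the labels destroys any true feature-outcome association,
    # so any excess of the returned value over 0.5 estimates the AUC
    # inflation attributable to overfitting.
    aucs = []
    for _ in range(n_permutations):
        y_perm = rng.permutation(y)  # scramble the outcome values
        scores = cross_val_score(LogisticRegression(max_iter=1000),
                                 X, y_perm, cv=10, scoring="roc_auc")
        aucs.append(scores.mean())
    return float(np.mean(aucs))

# Synthetic example: 200 samples, 10 features, random binary outcome.
X = rng.standard_normal((200, 10))
y = rng.integers(0, 2, size=200)
print(f"Mean permuted AUC: {mean_permuted_auc(X, y):.3f}")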

RESULTS:

Applying this approach while varying the sample size and the number of imaging features, we found that failing to control for overfitting could yield near-perfect prediction (AUC near 1.0) even on permuted outcomes. Cross-validation offered greater protection against overfitting than the other evaluation techniques, and for most classification algorithms a sample size of at least 200 was required to keep the AUC inflation attributable to overfitting below 0.05 when assessing as few as 10 features.
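For contrast, a hedged sketch of the "all data" (resubstitution) evaluation, which trains and scores on the same permuted labels and is therefore expected to show the largest inflation; as above, the data and classifier choice are illustrative assumptions, not the study's configuration.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = rng.integers(0, 2, size=200)

def mean_resubstitution_auc(X, y, n_permutations=50):
    aucs = []
    for _ in range(n_permutations):
        y_perm = rng.permutation(y)
        model = LogisticRegression(max_iter=1000).fit(X, y_perm)
        # Scoring on the training data itself rewards memorization of
        # label noise, so the average AUC drifts above the null of 0.5.
        aucs.append(roc_auc_score(y_perm, model.predict_proba(X)[:, 1]))
    return float(np.mean(aucs))

print(f"Mean resubstitution AUC: {mean_resubstitution_auc(X, y):.3f}")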

CONCLUSION:

This approach can be applied to any curated dataset to suggest how many features, and which analysis approaches, will limit overfitting.

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Acad Radiol Journal subject: RADIOLOGIA Year: 2024 Document type: Article Country of affiliation: United States of America Country of publication: United States of America