A spline-based tool to assess and visualize the calibration of multiclass risk predictions.

Van Hoorde, K; Van Huffel, S; Timmerman, D; Bourne, T; Van Calster, B

Affiliation

Van Hoorde K; KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Leuven, Belgium; KU Leuven, iMinds Medical Information Technologies, Leuven, Belgium.
Van Huffel S; KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Leuven, Belgium; KU Leuven, iMinds Medical Information Technologies, Leuven, Belgium.
Timmerman D; Department of Obstetrics & Gynecology, University Hospitals Leuven, Leuven, Belgium; KU Leuven, Department of Development & Regeneration, Leuven, Belgium.
Bourne T; Department of Obstetrics & Gynecology, University Hospitals Leuven, Leuven, Belgium; KU Leuven, Department of Development & Regeneration, Leuven, Belgium; Queen Charlotte's & Chelsea Hospital, Imperial College, Du Cane Road, London W12 0HS, UK.
Van Calster B; KU Leuven, Department of Development & Regeneration, Leuven, Belgium. Electronic address: ben.vancalster@med.kuleuven.be.

J Biomed Inform ; 54: 283-93, 2015 Apr.

Article in En | MEDLINE | ID: mdl-25579635

ABSTRACT

When validating risk models (or probabilistic classifiers), calibration is often overlooked. Calibration refers to the reliability of the predicted risks, i.e. whether the predicted risks correspond to observed probabilities. In medical applications this is important because treatment decisions often rely on the estimated risk of disease. The aim of this paper is to present generic tools to assess the calibration of multiclass risk models. We describe a calibration framework based on a vector spline multinomial logistic regression model. This framework can be used to generate calibration plots and calculate the estimated calibration index (ECI) to quantify lack of calibration. We illustrate these tools in relation to risk models used to characterize ovarian tumors. The outcome of the study is the surgical stage of the tumor when relevant and the final histological outcome, which is divided into five classes: benign, borderline malignant, stage I, stage II-IV, and secondary metastatic cancer. The 5909 patients included in the study are randomly split into equally large training and test sets. We developed and tested models using the following algorithms: logistic regression, support vector machines, k nearest neighbors, random forest, naive Bayes and nearest shrunken centroids. Multiclass calibration plots are interesting as an approach to visualizing the reliability of predicted risks. The ECI is a convenient tool for comparing models, but is less informative and interpretable than calibration plots. In our case study, logistic regression and random forest showed the highest degree of calibration, and naive Bayes the lowest.
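
The idea of quantifying lack of calibration with a single index can be illustrated with a minimal Python sketch. The abstract does not give the ECI formula, so the following rests on an assumption: the index is taken to be the unscaled mean squared gap between each predicted risk and the corresponding observed probability estimated by a calibration model. The function name `squared_gap_index` and the example numbers are hypothetical, and the observed probabilities are supplied by hand here rather than produced by a fitted vector spline multinomial logistic regression.

```python
from typing import Sequence


def squared_gap_index(
    predicted: Sequence[Sequence[float]],
    observed: Sequence[Sequence[float]],
) -> float:
    """Mean squared gap between predicted risks and estimated observed
    probabilities, averaged over all patients and all classes.

    Assumption: this unscaled average is an illustrative stand-in, not
    necessarily the paper's exact ECI definition or scaling.
    """
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    count = 0
    for p_row, o_row in zip(predicted, observed):
        if len(p_row) != len(o_row):
            raise ValueError("each patient needs one value per class")
        for p, o in zip(p_row, o_row):
            total += (p - o) ** 2
            count += 1
    return total / count


# Hypothetical three-class example with two patients.
# Rows are patients, columns are classes; each row sums to 1.
example_predicted = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
# Hand-made stand-in for the probabilities a calibration model would estimate.
example_observed = [[0.6, 0.3, 0.1], [0.1, 0.5, 0.4]]

example_index = squared_gap_index(example_predicted, example_observed)
print(example_index)
```

A value of zero means the predicted risks coincide with the estimated observed probabilities, and larger values indicate a larger average gap. As the abstract notes, a single number of this kind is convenient for ranking models but hides which classes and which risk ranges are miscalibrated, which is what the calibration plots show.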

Subject(s)
Key words

Calibration; Logistic regression; Machine learning; Multiclass; Probability estimation; Risk models

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Models, Statistical / Risk Assessment / Decision Support Systems, Clinical
Type of study: Etiology_studies / Prognostic_studies / Risk_factors_studies
Limits: Adult / Aged / Female / Humans / Middle aged
Language: En
Journal: J Biomed Inform
Journal subject: Medical Informatics
Year: 2015
Document type: Article
Affiliation country:
