Using Iterative Pairwise External Validation to Contextualize Prediction Model Performance: A Use Case Predicting 1-Year Heart Failure Risk in Patients with Diabetes Across Five Data Sources.
Drug Saf; 45(5): 563-570, 2022 05.
Article in En | MEDLINE | ID: mdl-35579818
INTRODUCTION: External validation of prediction models is increasingly seen as a minimum requirement for acceptance in clinical practice. However, the lack of interoperability of healthcare databases has been the biggest barrier to external validation on a large scale. Recent improvements in database interoperability enable a standardized analytical framework for model development and external validation. Yet external validation of a model in a new database lacks context: there is no benchmark in that database against which the external performance can be compared. Iterative pairwise external validation (IPEV) is a framework that uses a rotating model development and validation approach to contextualize the assessment of performance across a network of databases. As a use case, we predicted 1-year risk of heart failure in patients with type 2 diabetes mellitus.
METHODS: The method follows a two-step process involving (1) development of baseline and data-driven models in each database according to best practices and (2) validation of these models across the remaining databases. We introduce a heatmap visualization that supports the assessment of internal and external model performance in all available databases. As a use case, we developed and validated models to predict 1-year risk of heart failure in patients initiating a second pharmacological intervention for type 2 diabetes mellitus. We used the Observational Medical Outcomes Partnership common data model to create an open-source software package that increases the consistency, speed, and transparency of this process.
RESULTS: A total of 403,187 patients from five databases were included in the study. We developed five models that, when assessed internally, had an area under the receiver operating characteristic curve ranging from 0.73 to 0.81, with acceptable calibration. When we externally validated these models in a new database, three models achieved consistent performance and, in context, often performed similarly to models developed in that database itself. The visualization of IPEV provided valuable insights. From this, we identified the model developed in the Commercial Claims and Encounters (CCAE) database as the best-performing model overall.
CONCLUSION: Using IPEV lends weight to the model development process. The rotation of development through multiple databases provides context to model assessment, leading to improved understanding of transportability and generalizability. The inclusion of a baseline model in all modelling steps provides further context to the performance gains of increasing model complexity. The CCAE model was identified as a candidate for clinical use. The use case demonstrates that IPEV provides a huge opportunity in a new era of standardized data and analytics to improve insight into and trust in prediction models at an unprecedented scale.
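The rotating develop-and-validate step described in METHODS can be illustrated with a minimal sketch. This is not the authors' software package: the function names (`auroc`, `ipev_matrix`, `fit_toy`), the database labels "A" and "B", and the toy data are all hypothetical, chosen only to show how a development-by-validation performance matrix (the numbers behind a heatmap) might be assembled. Note that the diagonal here is computed on the development data itself, whereas a real study would use held-out data for internal validation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types: a dataset is (feature rows, binary outcomes);
# a model maps one feature row to a risk score.
Dataset = Tuple[List[List[float]], List[int]]
Model = Callable[[List[float]], float]


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a random case outranks a random
    non-case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ipev_matrix(databases: Dict[str, Dataset],
                fit: Callable[[Dataset], Model]) -> Dict[str, Dict[str, float]]:
    """Develop a model in each database, then score it in every database.
    Outer key: development database; inner key: validation database."""
    matrix: Dict[str, Dict[str, float]] = {}
    for dev_name, dev_data in databases.items():
        model = fit(dev_data)
        matrix[dev_name] = {}
        for val_name, (rows, labels) in databases.items():
            scores = [model(row) for row in rows]
            # Diagonal entries are in-sample here (an assumption of this
            # sketch); off-diagonal entries are external validations.
            matrix[dev_name][val_name] = auroc(scores, labels)
    return matrix


def fit_toy(data: Dataset) -> Model:
    """Toy stand-in for model development: use the first feature, oriented
    so that higher scores point toward the outcome."""
    rows, labels = data
    cases = [r[0] for r, y in zip(rows, labels) if y == 1]
    others = [r[0] for r, y in zip(rows, labels) if y == 0]
    higher_in_cases = sum(cases) / len(cases) >= sum(others) / len(others)
    sign = 1.0 if higher_in_cases else -1.0
    return lambda row: sign * row[0]


# Fabricated toy data, for illustration only.
databases: Dict[str, Dataset] = {
    "A": ([[1.0], [2.0], [3.0], [4.0]], [0, 0, 1, 1]),
    "B": ([[1.0], [2.0], [3.0], [4.0]], [0, 1, 0, 1]),
}

matrix = ipev_matrix(databases, fit_toy)
for dev_name, row in matrix.items():
    print(dev_name, {val: round(a, 2) for val, a in row.items()})
```

Reading down a column of such a matrix compares every model within one validation database, which is the benchmark that a single external validation lacks; a plotting library could render the same matrix as a heatmap.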
Collection: 01-internacional
Database: MEDLINE
Main subject: Diabetes Mellitus, Type 2 / Heart Failure
Type of study: Etiology_studies / Guideline / Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: En
Journal: Drug Saf
Journal subject: Drug Therapy / Toxicology
Year: 2022
Document type: Article