Outcome class imbalance and rare events: An underappreciated complication for overdose risk prediction modeling.
Cartus, Abigail R; Samuels, Elizabeth A; Cerdá, Magdalena; Marshall, Brandon D L.
Affiliations
  • Cartus AR; Department of Epidemiology, Brown University School of Public Health, Providence, Rhode Island, USA.
  • Samuels EA; Department of Epidemiology, Brown University School of Public Health, Providence, Rhode Island, USA.
  • Cerdá M; Department of Emergency Medicine, Alpert Medical School of Brown University, Providence, Rhode Island, USA.
  • Marshall BDL; Division of Epidemiology, Department of Population Health, Center for Opioid Epidemiology and Policy, School of Medicine, New York University, New York, New York, USA.
Addiction; 118(6): 1167-1176, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36683137
ABSTRACT
BACKGROUND AND AIMS:

Low outcome prevalence, often observed with opioid-related outcomes, poses an underappreciated challenge to accurate predictive modeling. Outcome class imbalance, where non-events (i.e. negative class observations) outnumber events (i.e. positive class observations) by a moderate to extreme degree, can distort measures of predictive accuracy in misleading ways, and make the overall predictive accuracy and the discriminatory ability of a predictive model appear spuriously high. We conducted a simulation study to measure the impact of outcome class imbalance on predictive performance of a simple SuperLearner ensemble model and suggest strategies for reducing that impact.

DESIGN, SETTING, PARTICIPANTS:

Using a Monte Carlo design with 250 repetitions, we trained and evaluated these models on four simulated data sets with 100 000 observations each: one with perfect balance between events and non-events, and three where non-events outnumbered events by an approximate factor of 10:1, 100:1, and 1000:1, respectively.

MEASUREMENTS:

We evaluated the performance of these models using a comprehensive suite of measures, including measures that are more appropriate for imbalanced data.
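
To make the simulated data sets described above more concrete, the following Python sketch draws binary outcomes at several imbalance levels. It is an illustration only: the abstract does not describe the authors' data-generating process, so the function `simulate_outcomes`, the seeds, and the imbalance levels in `RATIOS` (1, 10, 100, and 1000 non-events per event) are assumptions, and no predictors or SuperLearner models are included.

```python
import random


def simulate_outcomes(n, nonevents_per_event, seed=0):
    """Draw n binary outcomes with P(event) = 1 / (1 + nonevents_per_event).

    Hypothetical helper for illustration; not the authors' simulation code.
    """
    rng = random.Random(seed)
    prevalence = 1.0 / (1.0 + nonevents_per_event)
    return [1 if rng.random() < prevalence else 0 for _ in range(n)]


N_OBS = 100_000              # observations per data set, as in the abstract
RATIOS = [1, 10, 100, 1000]  # assumed non-events per event (1 = perfect balance)

# One data set per imbalance level; a full Monte Carlo study would repeat
# this with a fresh seed for each repetition.
datasets = {r: simulate_outcomes(N_OBS, r, seed=r) for r in RATIOS}

for r, y in datasets.items():
    events = sum(y)
    print(f"{r}:1  events={events}  observed prevalence={events / len(y):.4f}")
```

Under these assumptions, the most imbalanced data set would contain on the order of 100 events among 100 000 observations, which shows how little positive-class information a model has to learn from.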

FINDINGS:

Increasing imbalance tended to spuriously improve overall accuracy (using a high threshold to classify events vs non-events, overall accuracy improved from 0.45 with perfect balance to 0.99 with the most severe outcome class imbalance), but diminished predictive performance was evident using other metrics (corresponding positive predictive value decreased from 0.99 to 0.14).
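
The opposite movement of overall accuracy and positive predictive value (PPV) follows from how each depends on prevalence. The Python sketch below holds a classifier's sensitivity and specificity fixed and varies only the outcome prevalence. The values `SENS = 0.20` and `SPEC = 0.99` are hypothetical, chosen to mimic a high classification threshold; they are not taken from the paper and will not reproduce its reported figures.

```python
def accuracy(sens, spec, prev):
    """Overall accuracy: prevalence-weighted mean of sensitivity and specificity."""
    return sens * prev + spec * (1.0 - prev)


def ppv(sens, spec, prev):
    """Positive predictive value via Bayes' rule."""
    true_pos = sens * prev
    false_pos = (1.0 - spec) * (1.0 - prev)
    return true_pos / (true_pos + false_pos)


SENS, SPEC = 0.20, 0.99  # hypothetical operating point (high threshold)

results = {}
for ratio in [1, 10, 100, 1000]:   # assumed non-events per event
    prev = 1.0 / (1.0 + ratio)
    results[ratio] = (accuracy(SENS, SPEC, prev), ppv(SENS, SPEC, prev))
    acc, p = results[ratio]
    print(f"{ratio}:1  prevalence={prev:.4f}  accuracy={acc:.3f}  PPV={p:.3f}")
```

As prevalence falls, accuracy converges toward specificity because almost every observation is a non-event, while PPV shrinks because false positives from the large negative class swamp the few true positives. In the limiting case, a classifier that never flags anyone (sensitivity 0, specificity 1) attains an accuracy of one minus the prevalence despite identifying no events.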

CONCLUSION:

Increasing reliance on algorithmic risk scores in consequential decision-making processes raises critical fairness and ethical concerns. This paper provides broad guidance for analytic strategies that clinical investigators can use to remedy the impacts of outcome class imbalance on risk prediction tools.

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Drug Overdose Type of study: Etiology_studies / Guideline / Prognostic_studies / Risk_factors_studies Aspect: Ethics Limit: Humans Language: En Journal: Addiction Journal subject: SUBSTANCE-RELATED DISORDERS Year: 2023 Document type: Article Affiliation country: United States
