Implications of non-stationarity on predictive modeling using EHRs.

Jung, Kenneth; Shah, Nigam H

Affiliation

Jung K; Program in Biomedical Informatics, Stanford University, Stanford, CA, United States. Electronic address: kjung@stanford.edu.
Shah NH; Center for Biomedical Informatics Research, Stanford University, Stanford, CA, United States.

J Biomed Inform; 58: 168-174, 2015 Dec.

Article in En | MEDLINE | ID: mdl-26483171

ABSTRACT

The rapidly increasing volume of clinical information captured in Electronic Health Records (EHRs) has led to the application of increasingly sophisticated models for purposes such as disease subtype discovery and predictive modeling. However, increasing adoption of EHRs implies that in the near future, much of the data available for such purposes will be from a time period during which both the practice of medicine and the clinical use of EHRs are in flux due to historic changes in both technology and incentives. In this work, we explore the implications of this phenomenon, called non-stationarity, on predictive modeling. We focus on the problem of predicting delayed wound healing using data available in the EHR during the first week of care in outpatient wound care centers, using a large dataset covering over 150,000 individual wounds and 59,958 patients seen over a period of four years. We manipulate the degree of non-stationarity seen by the model development process by changing the way data is split into training and test sets. We demonstrate that non-stationarity can lead to quite different conclusions regarding the relative merits of different models with respect to predictive power and calibration of their posterior probabilities. Under the non-stationarity exhibited in this dataset, the performance advantage of complex methods such as stacking relative to the best simple classifier disappears. Ignoring non-stationarity can thus lead to sub-optimal model selection in this task.
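The abstract does not say exactly how the splits were constructed, so the following is only a minimal sketch of one common way to vary how much non-stationarity a model sees: a random split mixes all time periods into both sets, while a temporal split trains on earlier records and tests on later ones. The record fields (`wound_id`, `first_visit`), the 25% test fraction, and the synthetic data are all hypothetical and are not taken from the study.

```python
import random
from datetime import date, timedelta


def random_split(records, test_fraction=0.25, seed=0):
    """Shuffle records so training and test sets cover the same time period."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(records, test_fraction=0.25):
    """Train on the earliest records and test on the most recent ones."""
    ordered = sorted(records, key=lambda r: r["first_visit"])
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Synthetic stand-in data (hypothetical): one wound record per day.
start = date(2010, 1, 1)
records = [
    {"wound_id": i, "first_visit": start + timedelta(days=i)}
    for i in range(1000)
]

train_r, test_r = random_split(records)
train_t, test_t = temporal_split(records)

# Under the temporal split, every test record is later than every training record.
latest_train = max(r["first_visit"] for r in train_t)
earliest_test = min(r["first_visit"] for r in test_t)
print(latest_train < earliest_test)
```

Evaluating the same candidate models under both splits is one way to check whether a ranking obtained from a random split still holds when the test data come from a later period.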

Subjects

Electronic Health Records; Models, Theoretical; Diffusion of Innovation; Humans; Wound Healing

Keywords

Data mining; Machine learning; Predictive model; Prognostic model; Wound healing

Full text: 1; Thematic axes: Inovacao_tecnologica; Database: MEDLINE; Main subject: Electronic Health Records / Models, Theoretical; Study type: Prognostic_studies / Risk_factors_studies / Sysrev_observational_studies; Limit: Humans; Language: En; Publication year: 2015; Document type: Article
