On the evaluation of synthetic longitudinal electronic health records.

Achterberg, Jim L; Haas, Marcel R; Spruit, Marco R


Affiliation

Achterberg JL; Public Health and Primary Care, Health Campus The Hague, Leiden University Medical Center, Albinusdreef 2, Leiden, South-Holland, 2333ZA, Netherlands. j.l.achterberg@lumc.nl.
Haas MR; Public Health and Primary Care, Health Campus The Hague, Leiden University Medical Center, Albinusdreef 2, Leiden, South-Holland, 2333ZA, Netherlands.
Spruit MR; Public Health and Primary Care, Health Campus The Hague, Leiden University Medical Center, Albinusdreef 2, Leiden, South-Holland, 2333ZA, Netherlands.

BMC Med Res Methodol; 24(1): 181, 2024 Aug 14.

Article in English | MEDLINE | ID: mdl-39143466

ABSTRACT


BACKGROUND:

Synthetic Electronic Health Records (EHRs) are becoming increasingly popular as a privacy-enhancing technology. However, for longitudinal EHRs specifically, little research has been done on how to properly evaluate synthetically generated samples. In this article, we discuss existing methods and provide recommendations for evaluating the quality of synthetic longitudinal EHRs.

METHODS:

We recommend assessing synthetic EHR quality through four metrics: similarity to real EHRs in low-dimensional projections, accuracy of a classifier discriminating synthetic from real samples, performance of algorithms trained on synthetic versus real data in clinical tasks, and privacy risk measured as the risk of attribute inference. For each metric, we discuss strengths and weaknesses and show how it can be applied to a longitudinal dataset.
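As an illustration of the second metric, the following is a minimal sketch of a real-versus-synthetic discriminator. It is not the authors' implementation: the abstract does not name a classifier, so a leave-one-out 1-nearest-neighbour rule is swapped in here, each patient record is assumed to be a list of per-visit feature vectors of equal length, and the toy Gaussian data and the helper names (`flatten`, `toy_records`, `discriminator_accuracy`) are hypothetical.

```python
import random


def flatten(record):
    """Concatenate the per-visit feature vectors of one patient into one flat vector."""
    return [value for visit in record for value in visit]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def discriminator_accuracy(real, synthetic):
    """Leave-one-out accuracy of a 1-nearest-neighbour real-vs-synthetic classifier."""
    samples = [(flatten(r), 0) for r in real] + [(flatten(s), 1) for s in synthetic]
    correct = 0
    for i, (x, label) in enumerate(samples):
        # Predict the label of the closest sample, excluding the sample itself.
        _, predicted = min(
            (sq_dist(x, other), other_label)
            for j, (other, other_label) in enumerate(samples)
            if j != i
        )
        correct += predicted == label
    return correct / len(samples)


def toy_records(n, mean, rng, visits=4, features=3):
    """Hypothetical stand-in data: n patients, each with `visits` Gaussian feature vectors."""
    return [
        [[rng.gauss(mean, 1.0) for _ in range(features)] for _ in range(visits)]
        for _ in range(n)
    ]


rng = random.Random(0)
real = toy_records(50, 0.0, rng)
good_synth = toy_records(50, 0.0, rng)  # drawn from the same distribution as `real`
poor_synth = toy_records(50, 3.0, rng)  # drawn from a clearly shifted distribution

acc_good = discriminator_accuracy(real, good_synth)
acc_poor = discriminator_accuracy(real, poor_synth)
print(f"discriminator accuracy, similar data: {acc_good:.2f}")
print(f"discriminator accuracy, shifted data: {acc_poor:.2f}")
```

Under these assumptions, an accuracy close to 0.5 suggests the classifier cannot tell synthetic from real records, while an accuracy close to 1 indicates that the synthetic records are easy to spot. Flattening the visits discards the ordering of time steps, so a sequence-aware classifier may be more appropriate for real longitudinal data.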

RESULTS:

To support the discussion on evaluation metrics, we apply the discussed metrics to a dataset of synthetic EHRs generated from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) repository.

CONCLUSIONS:

The discussion on evaluation metrics provides guidance for researchers on how to use and interpret different metrics when evaluating the quality of synthetic longitudinal EHRs.

Subjects

Algorithms; Electronic Health Records; Electronic Health Records/statistics & numerical data; Electronic Health Records/standards; Humans; Longitudinal Studies; Privacy

Keywords

Electronic health records; Goodness-of-Fit; Longitudinal; Privacy risk; Synthetic data


Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Electronic Health Records | Limit: Humans | Language: English | Publication year: 2024 | Document type: Article
