Automatic generation of case-detection algorithms to identify children with asthma from large electronic health record databases.
Afzal, Zubair; Engelkes, Marjolein; Verhamme, Katia M C; Janssens, Hettie M; Sturkenboom, Miriam C J M; Kors, Jan A; Schuemie, Martijn J.
Affiliation
  • Afzal Z; Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands. m.afzal@erasmusmc.nl
Pharmacoepidemiol Drug Saf; 22(8): 826-33, 2013 Aug.
Article in En | MEDLINE | ID: mdl-23592573
ABSTRACT

PURPOSE:

Most electronic health record databases contain unstructured free-text narratives, which cannot be easily analyzed. Case-detection algorithms are usually created manually and often rely only on coded information, such as International Classification of Diseases, Ninth Revision (ICD-9) codes. We applied a machine-learning approach to generate and evaluate an automated case-detection algorithm that uses both free-text and coded information to identify asthma cases.

METHODS:

The Integrated Primary Care Information (IPCI) database was searched for potential asthma patients aged 5-18 years using a broad query on asthma-related codes, drugs, and free text. A training set of 5032 patients was created by manually annotating the potential patients as definite, probable, or doubtful asthma cases or non-asthma cases. The rule-learning program RIPPER was then used to generate algorithms to distinguish cases from non-cases. An over-sampling method was used to balance the performance of the automated algorithm to meet our study requirements. Performance of the automated algorithm was evaluated against the manually annotated set.
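
The abstract does not say which over-sampling method was used. As a minimal sketch, assuming simple duplication of the case class before rule learning, the idea can be illustrated as follows. The function name `oversample`, the `factor` parameter, and the toy records are hypothetical and are not taken from the study.

```python
import random


def oversample(records, label_of, target_label, factor, seed=0):
    """Return a shuffled copy of `records` in which every record whose
    label equals `target_label` appears `factor` times in total.

    Hypothetical helper: the study's actual over-sampling method is not
    described in the abstract.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    targets = [r for r in records if label_of(r) == target_label]
    others = [r for r in records if label_of(r) != target_label]
    result = others + targets * factor  # duplicate the target class
    random.Random(seed).shuffle(result)  # avoid ordering effects in training
    return result


# Toy data (made up): one annotated case, two non-cases.
patients = [("p1", "case"), ("p2", "non-case"), ("p3", "non-case")]
balanced = oversample(patients, lambda r: r[1], "case", factor=3)
```

Raising `factor` makes a missed case cost the learner more than a false alarm, which typically pushes the learned rules toward higher sensitivity at the expense of PPV.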

RESULTS:

The selected algorithm yielded a positive predictive value (PPV) of 0.66, sensitivity of 0.98, and specificity of 0.95 when identifying only definite asthma cases; a PPV of 0.82, sensitivity of 0.96, and specificity of 0.90 when identifying both definite and probable asthma cases; and a PPV of 0.57, sensitivity of 0.95, and specificity of 0.67 for the scenario identifying definite, probable, and doubtful asthma cases.
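
The three reported measures follow from the confusion matrix of the algorithm's output against the manual annotation. The sketch below gives the standard definitions; the function name `detection_metrics` is hypothetical, and the counts in the usage line are made up for illustration and are not the study's data.

```python
def detection_metrics(tp, fp, fn, tn):
    """Compute PPV, sensitivity, and specificity from confusion-matrix counts.

    tp: cases flagged by the algorithm     fp: non-cases flagged
    fn: cases missed by the algorithm      tn: non-cases correctly not flagged
    """
    if tp + fp == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("each denominator must be non-zero")
    return {
        "ppv": tp / (tp + fp),          # flagged patients that are true cases
        "sensitivity": tp / (tp + fn),  # true cases that were flagged
        "specificity": tn / (tn + fp),  # non-cases that were not flagged
    }


# Illustrative counts only.
metrics = detection_metrics(tp=90, fp=10, fn=10, tn=90)
```

Because PPV depends on how common true cases are among the evaluated patients, it can differ considerably between scenarios even when sensitivity and specificity are similar.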

CONCLUSIONS:

The automated algorithm shows good performance in detecting cases of asthma utilizing both free-text and coded data. This algorithm will facilitate large-scale studies of asthma in the IPCI database.

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Asthma / Algorithms / Electronic Health Records
Type of study: Diagnostic_studies / Prognostic_studies
Limits: Adolescent / Child / Child, preschool / Humans
Language: En
Journal: Pharmacoepidemiol Drug Saf
Journal subject: Epidemiology / Drug Therapy
Year: 2013
Document type: Article
Affiliation country: Netherlands