Practical use case of natural language processing for observational clinical research data retrieval from electronic health records: AssistMED project.
Pol Arch Intern Med; 134(5), 2024 05 28.
Article in En | MEDLINE | ID: mdl-38501989
ABSTRACT
INTRODUCTION:
Electronic health records (EHRs) contain data valuable for clinical research. However, they are in textual format and require manual encoding into databases, which is a lengthy and costly process. Natural language processing (NLP) is a computational technique that allows for text analysis.
OBJECTIVES:
Our study aimed to demonstrate a practical use case of NLP for characterizing a large retrospective study cohort and to compare it with human retrieval.
PATIENTS AND METHODS:
Anonymized discharge documentation of 10 314 patients from a cardiology tertiary care department was analyzed for inclusion in the CRAFT registry (Multicenter Experience in Atrial Fibrillation Patients Treated with Oral Anticoagulants; NCT02987062). Extensive clinical characteristics regarding concomitant diseases, medications, daily drug dosages, and echocardiography were collected manually and through NLP.
RESULTS:
There were 3030 and 3029 patients identified by the human and NLP-based approaches, respectively, reflecting 99.93% accuracy of NLP in detecting atrial fibrillation (AF). Comprehensive baseline patient characterization by NLP was faster than human analysis (3 h and 15 min vs 71 h and 12 min). The calculated CHA2DS2-VASc and HAS-BLED scores based on both methods did not differ (human vs NLP; median [interquartile range], 3 [2-5] vs 3 [2-5]; P = 0.74 and 1 [1-2] vs 1 [1-2]; P = 0.63, respectively). For most data, an almost perfect agreement between NLP- and human-retrieved characteristics was found; daily dosage identification was the least accurate NLP feature. Similar conclusions on cohort characteristics would be drawn from either method; however, daily dosage detection for some drug groups would require additional human validation in the NLP-based cohort.
CONCLUSIONS:
NLP utilization in EHRs may accelerate data acquisition and provide accurate information for retrospective studies.
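The abstract reports "almost perfect agreement" between NLP- and human-retrieved characteristics without naming the statistic used. That wording is the conventional label for a Cohen's kappa above 0.80, so the sketch below assumes kappa is the measure; this is an assumption, not something the abstract confirms. The function name `cohen_kappa` and the sample labels are hypothetical and only illustrate how agreement on one binary characteristic (for example, presence of a concomitant disease) could be computed.

```python
from collections import Counter


def cohen_kappa(human, nlp):
    """Cohen's kappa for two equal-length sequences of categorical labels.

    Assumption: kappa is the agreement statistic; the abstract does not say so.
    """
    if len(human) != len(nlp) or len(human) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(human)
    # Observed agreement: share of patients on which both methods give the same label.
    observed = sum(h == m for h, m in zip(human, nlp)) / n

    # Expected chance agreement from each method's marginal label frequencies.
    human_counts = Counter(human)
    nlp_counts = Counter(nlp)
    expected = sum(human_counts[label] * nlp_counts[label] for label in human_counts) / (n * n)

    if expected == 1.0:
        # Both methods assign one identical label to every patient.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for one characteristic in ten patients (1 = present, 0 = absent).
human_labels = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
nlp_labels = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohen_kappa(human_labels, nlp_labels)
```

Kappa corrects raw percentage agreement for agreement expected by chance, which matters for characteristics that are rare or nearly universal in the cohort.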
Database:
MEDLINE
Main subject:
Atrial Fibrillation / Natural Language Processing / Electronic Health Records
Limits:
Aged / Female / Humans / Male / Middle aged
Language:
En
Journal:
Pol Arch Intern Med
Year of publication:
2024
Document type:
Article
Country of affiliation:
Poland