Extracting seizure control metrics from clinic notes of patients with epilepsy: A natural language processing approach.

Fernandes, Marta; Cardall, Aidan; Moura, Lidia Mvr; McGraw, Christopher; Zafar, Sahar F; Westover, M Brandon

Affiliation

Fernandes M; Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States. Electronic address: mbentofernandes@mgh.harvard.edu.
Cardall A; Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
Moura LM; Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
McGraw C; Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
Zafar SF; Department of Neurology, Massachusetts General Hospital (MGH), Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
Westover MB; Harvard Medical School, Boston, MA, United States; Beth Israel Deaconess Medical Center (BIDMC), Boston, MA, United States.

Epilepsy Res ; 207: 107451, 2024 Sep 10.

Article in English | MEDLINE | ID: mdl-39276641

ABSTRACT

OBJECTIVES:

Monitoring seizure control metrics is key to clinical care of patients with epilepsy. Manually abstracting these metrics from unstructured text in electronic health records (EHR) is laborious. We aimed to abstract the date of last seizure and seizure frequency from clinical notes of patients with epilepsy using natural language processing (NLP).

METHODS:

We extracted seizure control metrics from notes of patients seen in epilepsy clinics at two hospitals in Boston. Extraction of both the date of last seizure and seizure frequency was performed with the pretrained model RoBERTa_for_seizureFrequency_QA, combined with regular expressions. We designed the algorithm to categorize the timing of last seizure ("today", "1-6 days ago", "1-4 weeks ago", "more than 1-3 months ago", "more than 3-6 months ago", "more than 6-12 months ago", "more than 1-2 years ago", "more than 2 years ago") and seizure frequency ("innumerable", "multiple", "daily", "weekly", "monthly", "once per year", "less than once per year"). Our ground truth consisted of structured questionnaires filled out by physicians. Model performance was measured using the areas under the receiver operating characteristic curve (AUROC) and precision-recall curve (AUPRC) for categorical labels, and median absolute error (MAE) for ordinal labels, with 95% confidence intervals (CI) estimated via bootstrapping.
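The categorization of the timing of last seizure described above can be illustrated with a minimal sketch. The category labels are quoted from the abstract; the function name `categorize_last_seizure` and the day thresholds (a month taken as 30 days, a year as 365 days) are assumptions made for illustration, since the abstract does not state the exact cutoffs the authors used.

```python
from datetime import date

# Labels are quoted from the abstract. The upper bounds in days are
# ASSUMPTIONS (1 month = 30 days, 1 year = 365 days), not the authors' cutoffs.
LAST_SEIZURE_BINS = [
    (0, "today"),
    (6, "1-6 days ago"),
    (30, "1-4 weeks ago"),
    (90, "more than 1-3 months ago"),
    (180, "more than 3-6 months ago"),
    (365, "more than 6-12 months ago"),
    (730, "more than 1-2 years ago"),
]


def categorize_last_seizure(visit_date: date, last_seizure_date: date) -> str:
    """Map the gap between a visit and the last seizure to a category label."""
    days = (visit_date - last_seizure_date).days
    if days < 0:
        raise ValueError("last seizure date is after the visit date")
    # Return the first bin whose (assumed) upper bound covers the gap.
    for upper_bound_days, label in LAST_SEIZURE_BINS:
        if days <= upper_bound_days:
            return label
    return "more than 2 years ago"
```

In this sketch, a date extracted from a note would first be converted to a `date` object and then passed in together with the visit date; how the extracted text is normalized to a date is outside what the abstract describes.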

RESULTS:

Our cohort included 1773 adult patients with a total of 5658 visits with reported seizure control metrics, seen in epilepsy clinics between December 2018 and May 2022. The average age of the cohort was 42 years; the majority of patients were female (57%), White (81%), and non-Hispanic (85%). The models achieved an MAE (95% CI) of 4 (4.00-4.86) weeks for date of last seizure and 0.02 (0.02-0.02) seizures per day for seizure frequency.
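As an illustration of how a median absolute error with a bootstrapped 95% CI of this kind might be computed, the following is a minimal sketch using a percentile bootstrap. The function name `median_abs_error_ci`, the number of resamples, and the percentile method are assumptions; the abstract states only that CIs were estimated via bootstrapping and does not describe the procedure.

```python
import random
import statistics


def median_abs_error_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Return (median absolute error, CI lower bound, CI upper bound).

    Percentile bootstrap over per-visit absolute errors. The resample count
    and the percentile method are ASSUMPTIONS, not the authors' procedure.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Resample the errors with replacement and take the median of each resample.
    boot = sorted(
        statistics.median(rng.choices(abs_err, k=len(abs_err)))
        for _ in range(n_boot)
    )
    # Nearest-rank percentiles of the bootstrap distribution.
    lower = boot[int((alpha / 2) * (n_boot - 1))]
    upper = boot[int(round((1 - alpha / 2) * (n_boot - 1)))]
    return statistics.median(abs_err), lower, upper
```

Here `y_true` would hold the physician-reported values and `y_pred` the model outputs, both on the same scale (for example, weeks since last seizure or seizures per day).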

CONCLUSIONS:

Our NLP approach demonstrates that extracting seizure control metrics from EHR is feasible, allowing for large-scale EHR research.

Keywords

Electronic health records; Epilepsy; Large language model; Natural language processing; Phenotyping; Seizure control

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: Epilepsy Res Journal subject: BRAIN / NEUROLOGY Year: 2024 Document type: Article
