Machine learning-based natural language processing to extract PD-L1 expression levels from clinical notes.
Health Informatics J; 29(3): 14604582231198021, 2023.
Article in English | MEDLINE | ID: mdl-37635280
ABSTRACT
Introduction:
PD-L1 expression is used to determine oncology patients' eligibility for, and response to, immunologic treatments; however, PD-L1 expression status often exists only in unstructured clinical notes, which limits its use in population-level studies.
Methods:
We developed and evaluated a machine learning-based natural language processing (NLP) tool to extract PD-L1 expression values from the nationwide Veterans Affairs electronic health record system.
Results:
The model performed strongly across multiple levels of label granularity. For the overall PD-L1 positive label, mean precision was 0.859 (sd, 0.039), mean recall 0.994 (sd, 0.013), and mean F1 0.921 (sd, 0.024). When a numeric PD-L1 value was identified, the mean absolute error of the value was 0.537 on a scale of 0 to 100.
Conclusion:
We presented an accurate NLP method for deriving PD-L1 status from clinical notes. By reducing the time and manual effort needed to review medical records, our work will enable future population-level studies in cancer immunotherapy.
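The metrics reported in the abstract (precision, recall, F1, and mean absolute error) can be illustrated with a minimal sketch. The function names and the toy labels below are hypothetical and are not taken from the study; the sketch only shows how such metrics are conventionally computed for a binary "PD-L1 positive" label and for extracted numeric values on a 0 to 100 scale.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for a binary label (1 = PD-L1 positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


def mean_absolute_error(true_values: Sequence[float], predicted_values: Sequence[float]) -> float:
    """Average absolute difference between annotated and extracted numeric values."""
    if len(true_values) != len(predicted_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_values, predicted_values)) / len(true_values)


# Hypothetical toy data, not from the study.
labels_true = [1, 1, 1, 0, 0, 1]
labels_pred = [1, 1, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(labels_true, labels_pred)

values_true = [50, 1, 80]   # annotated PD-L1 percentages
values_pred = [50, 2, 80]   # extracted PD-L1 percentages
mae = mean_absolute_error(values_true, values_pred)
```

A pattern of high recall with lower precision, as in the reported results, means that few truly positive notes are missed while some notes are labeled positive incorrectly.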
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Natural Language Processing / B7-H1 Antigen
Type of study: Guideline
Limits: Humans
Language: En
Journal: Health Informatics J
Year: 2023
Document type: Article
Affiliation country: United States