Selective prediction for extracting unstructured clinical data.
J Am Med Inform Assoc; 31(1): 188-197, 2023 12 22.
Article | En | MEDLINE | ID: mdl-37769323
ABSTRACT
OBJECTIVE:
While there are currently approaches to handle unstructured clinical data, such as manual abstraction and structured proxy variables, these methods may be time-consuming, not scalable, and imprecise. This article aims to determine whether selective prediction, which gives a model the option to abstain from generating a prediction, can improve the accuracy and efficiency of unstructured clinical data abstraction.
MATERIALS AND METHODS:
We trained selective classifiers (logistic regression, random forest, support vector machine) to extract 5 variables from clinical notes: depression (n = 1563), glioblastoma (GBM, n = 659), rectal adenocarcinoma (DRA, n = 601), and abdominoperineal resection (APR, n = 601) and low anterior resection (LAR, n = 601) of adenocarcinoma. We varied the cost of false positives (FP), false negatives (FN), and abstained notes and measured total misclassification cost.
RESULTS:
The depression selective classifiers abstained on anywhere from 0% to 97% of notes, and the change in total misclassification cost ranged from -58% to 9%. Selective classifiers abstained on 5%-43% of notes across the GBM and colorectal cancer models. The GBM selective classifier abstained on 43% of notes, which led to improvements in sensitivity (0.94 to 0.96), specificity (0.79 to 0.96), PPV (0.89 to 0.98), and NPV (0.88 to 0.91) when compared to a non-selective classifier and when compared to structured proxy variables.
DISCUSSION:
We showed that selective classifiers outperformed both non-selective classifiers and structured proxy variables for extracting data from unstructured clinical notes.
CONCLUSION:
Selective prediction should be considered when abstaining is preferable to making an incorrect prediction.
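The cost trade-off described under Materials and Methods (separate costs for false positives, false negatives, and abstained notes, summed into a total misclassification cost) can be illustrated with a minimal sketch. The abstract does not state how the classifiers decide when to abstain, so the expected-cost rule below is an assumption chosen for illustration, and the names `Costs`, `decide`, `total_cost`, and `abstention_rate` are hypothetical. The probabilities and labels are made-up toy values, not study data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Costs:
    fp: float       # cost of a false positive
    fn: float       # cost of a false negative
    abstain: float  # cost of abstaining (e.g. routing the note to manual review)


def decide(p: float, costs: Costs) -> Optional[int]:
    """Return 1, 0, or None (abstain) for a predicted positive-class probability p.

    Assumed rule (not taken from the article): pick the action with the
    lowest expected cost. Ties favor a prediction over abstaining because
    of the dictionary's insertion order.
    """
    expected = {
        1: (1.0 - p) * costs.fp,  # predicting positive is wrong with probability 1 - p
        0: p * costs.fn,          # predicting negative is wrong with probability p
        None: costs.abstain,      # abstaining always incurs the abstention cost
    }
    return min(expected, key=expected.get)


def total_cost(decisions: List[Optional[int]], labels: List[int], costs: Costs) -> float:
    """Sum the FP, FN, and abstention costs over all notes."""
    total = 0.0
    for d, y in zip(decisions, labels):
        if d is None:
            total += costs.abstain
        elif d == 1 and y == 0:
            total += costs.fp
        elif d == 0 and y == 1:
            total += costs.fn
    return total


def abstention_rate(decisions: List[Optional[int]]) -> float:
    """Fraction of notes on which the classifier abstained."""
    return sum(d is None for d in decisions) / len(decisions)


# Toy inputs (illustrative only).
probs = [0.95, 0.60, 0.45, 0.10, 0.85, 0.30]
labels = [1, 0, 1, 0, 1, 0]
costs = Costs(fp=1.0, fn=1.0, abstain=0.2)

selective = [decide(p, costs) for p in probs]       # may contain None
baseline = [1 if p >= 0.5 else 0 for p in probs]    # non-selective, always predicts

print("selective decisions:", selective)
print("abstention rate:", abstention_rate(selective))
print("selective cost:", total_cost(selective, labels, costs))
print("baseline cost:", total_cost(baseline, labels, costs))
```

Raising `costs.abstain` toward the FP and FN costs makes the sketch abstain less often, and lowering it makes abstention more frequent, which is one plausible reading of why the reported abstention rates span such a wide range as the costs are varied.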
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Adenocarcinoma / Support Vector Machine
Study type: Guideline / Prognostic_studies / Qualitative_research / Risk_factors_studies
Limits: Humans
Language: En
Journal: J Am Med Inform Assoc
Journal subject: MEDICAL INFORMATICS
Year: 2023
Document type: Article
Affiliation country: United States