Detecting Hypoglycemia Incidents Reported in Patients' Secure Messages: Using Cost-Sensitive Learning and Oversampling to Reduce Data Imbalance.

Chen, Jinying; Lalor, John; Liu, Weisong; Druhl, Emily; Granillo, Edgard; Vimalananda, Varsha G; Yu, Hong

Affiliations

Chen J; Department of Population and Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, United States.
Lalor J; Bedford Veterans Affairs Medical Center, Center for Healthcare Organization and Implementation Research, Bedford, MA, United States.
Liu W; Bedford Veterans Affairs Medical Center, Center for Healthcare Organization and Implementation Research, Bedford, MA, United States.
Druhl E; College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, MA, United States.
Granillo E; Bedford Veterans Affairs Medical Center, Center for Healthcare Organization and Implementation Research, Bedford, MA, United States.
Vimalananda VG; Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, United States.
Yu H; Bedford Veterans Affairs Medical Center, Center for Healthcare Organization and Implementation Research, Bedford, MA, United States.

J Med Internet Res; 21(3): e11990, 2019 03 11.

Article in English | MEDLINE | ID: mdl-30855231

ABSTRACT

BACKGROUND:

Improper dosing of medications such as insulin can cause hypoglycemic episodes, which may lead to severe morbidity or even death. Although secure messaging was designed for exchanging nonurgent messages, patients sometimes report hypoglycemia events through secure messaging. Detecting these patient-reported adverse events may help alert clinical teams and enable early corrective actions to improve patient safety.

OBJECTIVE:

We aimed to develop a natural language processing system, called HypoDetect (Hypoglycemia Detector), to automatically identify hypoglycemia incidents reported in patients' secure messages.

METHODS:

An expert in public health annotated 3000 secure message threads between patients with diabetes and US Department of Veterans Affairs clinical teams as containing patient-reported hypoglycemia incidents or not. A physician independently annotated 100 threads randomly selected from this dataset to determine interannotator agreement. We used this dataset to develop and evaluate HypoDetect. HypoDetect incorporates 3 machine learning algorithms widely used for text classification: linear support vector machines, random forest, and logistic regression. We explored different learning features, including new knowledge-driven features. Because only 114 (3.80%) messages were annotated as positive, we investigated cost-sensitive learning and oversampling methods to mitigate the challenge of imbalanced data.
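The two imbalance remedies named above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the common "balanced" weighting heuristic (weight = n / (number of classes × class count)) for cost-sensitive learning, and it substitutes plain random oversampling for the synthetic minority oversampling technique used in the study. The function names and the integer stand-ins for messages are hypothetical; only the class counts (114 positive out of 3000) come from the abstract.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Per-class misclassification weights: n / (n_classes * class_count).

    This is one common heuristic for cost-sensitive learning (assumed here,
    not confirmed by the source): rarer classes get larger weights.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal.

    Simple random oversampling, swapped in for SMOTE to keep the sketch
    dependency-free; SMOTE would synthesize new feature vectors instead.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Class counts from the abstract: 114 positive, 2886 negative.
labels = [1] * 114 + [0] * 2886
rows = list(range(len(labels)))  # hypothetical stand-ins for message features

weights = balanced_class_weights(labels)
new_rows, new_labels = random_oversample(rows, labels)

print(weights)
print(Counter(new_labels))
```

When oversampling is combined with cross-validation, it is generally applied to the training folds only, so that duplicated or synthetic positives do not leak into the held-out fold.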

RESULTS:

The interannotator agreement was Cohen kappa=0.976. Using cross-validation, logistic regression with cost-sensitive learning achieved the best performance (area under the receiver operating characteristic curve=0.954, sensitivity=0.693, specificity=0.974, F1 score=0.590). Cost-sensitive learning and the ensembled synthetic minority oversampling technique improved the sensitivity of the baseline systems substantially (by 0.123 to 0.728 absolute gains). Our results show that a variety of features contributed to the best performance of HypoDetect.
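For readers less familiar with the threshold-based metrics reported above, the following Python sketch shows how sensitivity, specificity, precision, and F1 score are derived from a confusion matrix. The counts are hypothetical: they were chosen only to be consistent with the class sizes in the abstract (114 positive, 2886 negative) and are not reported in the source, whose figures come from cross-validation.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from raw counts."""
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts (assumption, not from the paper):
# 79 + 35 = 114 positives, 2811 + 75 = 2886 negatives.
metrics = confusion_metrics(tp=79, fp=75, tn=2811, fn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The example illustrates why F1 can stay moderate even when specificity is high: with so few positives, a small false-positive rate still yields a number of false alarms comparable to the number of true detections, which lowers precision.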

CONCLUSIONS:

Despite the challenge of data imbalance, HypoDetect achieved promising results for the task of detecting hypoglycemia incidents from secure messages. The system has great potential to facilitate early detection and treatment of hypoglycemia.

Subjects

Electronic Health Records/standards; Hypoglycemia/diagnosis; Natural Language Processing; Social Media/standards; Female; Humans; Male

Keywords

adverse event detection; drug-related side effects and adverse reactions; hypoglycemia; imbalanced data; natural language processing; secure messaging; supervised machine learning

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Natural Language Processing / Electronic Health Records / Social Media / Hypoglycemia Study type: Diagnostic_studies / Health_economic_evaluation / Prognostic_studies / Screening_studies Limits: Female / Humans / Male Language: En Publication year: 2019 Document type: Article
