Adverse drug event and medication extraction in electronic health records via a cascading architecture with different sequence labeling models and word embeddings.

Dai, Hong-Jie; Su, Chu-Hsien; Wu, Chi-Shin

Affiliation

Dai HJ; Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan.
Su CH; Department of Post-Baccalaureate Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan.
Wu CS; Department of Psychiatry, National Taiwan University Hospital, Taipei, Taiwan R.O.C.

J Am Med Inform Assoc; 27(1): 47-55, 2020 Jan 01.

Article in English | MEDLINE | ID: mdl-31334805

ABSTRACT

OBJECTIVE: An adverse drug event (ADE) is an injury resulting from a medical intervention related to a drug, including harm caused by the drug itself or by the way it is used. Extracting ADEs from clinical records can help physicians associate adverse events with the drugs responsible.

MATERIALS AND METHODS: We proposed a cascading architecture to recognize medical concepts including ADEs, drug names, and entities related to drugs. The architecture combines a preprocessing method with an ensemble of conditional random fields (CRFs) and neural network-based models to address, respectively, the challenges of surrogate strings and overlapping annotation boundaries observed in the ADE and medication extraction (ADME) corpus used in this study. We also studied the effectiveness of different pretrained and postprocessed word embeddings for the ADME task.

RESULTS: The empirical results showed that both CRFs and neural network-based models provide promising solutions for the ADME task. The neural network-based models outperformed CRFs particularly on concept types involving narrative descriptions. Our best run achieved an overall micro F-score of 0.919 on the ADME corpus. Our results also suggested that Global Vectors for Word Representation (GloVe) embeddings trained on general-domain text provide a very strong baseline, which can be further improved by applying principal component analysis to generate more isotropic vectors.

CONCLUSIONS: We have demonstrated that the proposed cascading architecture can handle overlapping annotations and further improve overall recall and F-scores, because it enables the developed models to exploit more context information and forms an ensemble that yields a stronger recognizer.
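For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-averaged F-score is typically computed: true positives, false positives, and false negatives are pooled across all concept types before precision and recall are calculated. The function name `micro_f_score` and the counts in `toy_counts` are hypothetical illustrations, not the authors' evaluation script or their results.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one concept type
Counts = Tuple[int, int, int]


def micro_f_score(per_type: Dict[str, Counts]) -> float:
    """Pool counts over all concept types, then compute a single F1 score."""
    tp = sum(c[0] for c in per_type.values())
    fp = sum(c[1] for c in per_type.values())
    fn = sum(c[2] for c in per_type.values())
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up counts for illustration only; these are NOT figures from the paper.
toy_counts = {
    "ADE": (8, 2, 4),
    "Drug": (90, 5, 5),
}
toy_micro_f = micro_f_score(toy_counts)
print(f"micro F-score on toy counts: {toy_micro_f:.3f}")
```

Because the counts are pooled, frequent concept types dominate the micro average, so a single overall figure can mask weaker performance on rarer types.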

Subjects

Drug-Related Side Effects and Adverse Reactions; Electronic Health Records; Information Storage and Retrieval/methods; Natural Language Processing; Neural Networks, Computer; Algorithms; Humans; Narration; Terminology as Topic

Keywords

adverse drug event; electronic health record; information extraction; named entity recognition; word embedding

Full text: 1
Databases: MEDLINE
Main subject: Natural Language Processing / Information Storage and Retrieval / Neural Networks, Computer / Drug-Related Side Effects and Adverse Reactions / Electronic Health Records
Type of study: Prognostic_studies / Qualitative_research
Limits: Humans
Language: En
Journal: J Am Med Inform Assoc
Journal subject: MEDICAL INFORMATICS
Year of publication: 2020
Document type: Article
Country of affiliation: Taiwan