Medical concept normalization in social media posts with recurrent neural networks.

Tutubalina, Elena; Miftahutdinov, Zulfat; Nikolenko, Sergey; Malykh, Valentin

Affiliation

Tutubalina E; Kazan Federal University, 18 Kremlyovskaya street, Kazan 420008, Russian Federation; Insilico Medicine, Baltimore, MD 21218, United States. Electronic address: ElVTutubalina@kpfu.ru.
Miftahutdinov Z; Kazan Federal University, 18 Kremlyovskaya street, Kazan 420008, Russian Federation. Electronic address: zulfatmi@gmail.com.
Nikolenko S; Kazan Federal University, 18 Kremlyovskaya street, Kazan 420008, Russian Federation; St. Petersburg Department of the Steklov Mathematical Institute, 27 Fontanka, St. Petersburg 191023, Russian Federation; Neuromation OU, Tallinn 10111, Estonia. Electronic address: sergey@logic.pdmi.ras.ru.
Malykh V; Neural Systems and Deep Learning Laboratory, Moscow Institute of Physics and Technology, 9 bld. 7 Instituski per., Dolgoprudny 141700, Russian Federation; St. Petersburg Department of the Steklov Mathematical Institute, 27 Fontanka, St. Petersburg 191023, Russian Federation. Electronic address: vale

J Biomed Inform; 84: 93-102, 2018 Aug.

Article in English | MEDLINE | ID: mdl-29906585

Abstract

Text mining of scientific libraries and social media has already proven itself as a reliable tool for drug repurposing and hypothesis generation. The task of mapping a disease mention to a concept in a controlled vocabulary, typically to the standard thesaurus in the Unified Medical Language System (UMLS), is known as medical concept normalization. This task is challenging due to the differences in the use of medical terminology between health care professionals and social media texts coming from the lay public. To bridge this gap, we use sequence learning with recurrent neural networks and semantic representation of one- or multi-word expressions: we develop end-to-end architectures directly tailored to the task, including bidirectional Long Short-Term Memory, Gated Recurrent Units with an attention mechanism, and additional semantic similarity features based on UMLS. Our evaluation against a standard benchmark shows that recurrent neural networks improve results over an effective baseline for classification based on convolutional neural networks. A qualitative examination of mentions discovered in a dataset of user reviews collected from popular online health information platforms as well as a quantitative evaluation both show improvements in the semantic representation of health-related expressions in social media.

Subjects

Data Mining/methods; Medical Informatics/methods; Natural Language Processing; Neural Networks, Computer; Social Media; Unified Medical Language System; Linguistics; Pharmaceutical Preparations; Probability; Semantics; Social Networking

Keywords

Information extraction; Medical concept normalization; Natural language processing; Recurrent neural networks; Social media; User reviews

Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Medical Informatics / Natural Language Processing / Neural Networks, Computer / Unified Medical Language System / Data Mining / Social Media
Type of study: Qualitative research
Aspect: Social determinants of health
Language: English
Journal: J Biomed Inform
Journal subject: Medical Informatics
Year of publication: 2018
Document type: Article
Country of publication: United States
