Cross-lingual Unified Medical Language System entity linking in online health communities.
Bitton, Yonatan; Cohen, Raphael; Schifter, Tamar; Bachmat, Eitan; Elhadad, Michael; Elhadad, Noémie.
Affiliation
  • Bitton Y; Department of Computer Science, Ben Gurion University, Beer Sheva, Israel.
  • Cohen R; Department of Computer Science, Ben Gurion University, Beer Sheva, Israel.
  • Schifter T; Gertner Institute for Epidemiology and Health Policy Research, Tel HaShomer, Israel.
  • Bachmat E; Department of Computer Science, Ben Gurion University, Beer Sheva, Israel.
  • Elhadad M; Department of Computer Science, Ben Gurion University, Beer Sheva, Israel.
  • Elhadad N; Department of Biomedical Informatics, Columbia University, New York, New York, USA.
J Am Med Inform Assoc; 27(10): 1585-1592, 2020 Oct 01.
Article in English | MEDLINE | ID: mdl-32910823
OBJECTIVE: In Hebrew online health communities, participants commonly write medical terms as transliterations of an English source term. Such transliterations introduce high variability in the text and challenge text-analytics methods. To reduce this variability, medical terms must be normalized, for example by linking them to Unified Medical Language System (UMLS) concepts. We present a method to identify both transliterated and translated Hebrew medical terms and link them with UMLS entities.

MATERIALS AND METHODS: We investigate the effect of linking terms in Camoni, a popular Israeli online health community in Hebrew. Our method, MDTEL (Medical Deep Transliteration Entity Linking), includes (1) an attention-based recurrent neural network encoder-decoder to transliterate words and map UMLS from English to Hebrew, (2) an unsupervised method for creating a transliteration dataset in any language without manually labeled data, and (3) an efficient way to identify and link medical entities in the Hebrew corpus to UMLS concepts, by first producing a high-recall list of candidate medical terms in the corpus and then filtering the candidates down to relevant medical terms.

RESULTS: We carry out experiments on 3 disease-specific communities: diabetes, multiple sclerosis, and depression. On Camoni posts, MDTEL tagging and normalization achieved 99% accuracy, 92% recall, and 87% precision. When terms in queries from the Camoni search logs were tagged and normalized, UMLS-normalized queries improved search results in 46% of the cases.

CONCLUSIONS: Cross-lingual UMLS entity linking from Hebrew is possible and improves search performance across communities. Annotated datasets, annotation guidelines, and code are made available online (https://github.com/yonatanbitton/mdtel).
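The two-stage linking idea in step (3), a high-recall candidate list followed by a filter, can be illustrated with a minimal sketch. This is not the MDTEL implementation: the lexicon, the placeholder concept IDs, the character-similarity score, and both thresholds are assumptions chosen for illustration, and Latin-script strings stand in for Hebrew transliterations to keep the example readable.

```python
from difflib import SequenceMatcher

# Hypothetical toy lexicon: surface form -> placeholder concept ID.
# Real UMLS concept identifiers are deliberately not used here.
LEXICON = {
    "insulin": "CONCEPT_INSULIN",
    "metformin": "CONCEPT_METFORMIN",
    "depression": "CONCEPT_DEPRESSION",
}


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (an assumed stand-in score)."""
    return SequenceMatcher(None, a, b).ratio()


def generate_candidates(tokens, lexicon, loose=0.6):
    """High-recall stage: keep every (token, term) pair above a loose threshold."""
    candidates = []
    for token in tokens:
        for term, concept_id in lexicon.items():
            score = similarity(token, term)
            if score >= loose:
                candidates.append((token, term, concept_id, score))
    return candidates


def filter_candidates(candidates, strict=0.85):
    """Filtering stage: keep the best-scoring match per token if it clears
    a strict threshold (assumed; a real filter could use context features)."""
    best = {}
    for token, _term, concept_id, score in candidates:
        if score >= strict and (token not in best or score > best[token][1]):
            best[token] = (concept_id, score)
    return {token: concept_id for token, (concept_id, _) in best.items()}


# Misspelled variants mimic the surface variability of transliterated terms.
post = "my doctor raised the insuline dose and added metformine".split()
links = filter_candidates(generate_candidates(post, LEXICON))
print(links)
```

Separating the stages lets the loose threshold be tuned for recall and the strict one for precision independently, which mirrors the recall-then-filter design described in the abstract.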
Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Translating / Unified Medical Language System Limits: Humans Country/Region as subject: Asia Language: En Journal: J Am Med Inform Assoc Journal subject: MEDICAL INFORMATICS Year of publication: 2020 Document type: Article Country of affiliation: Israel Country of publication: United Kingdom
