CODER: Knowledge-infused cross-lingual medical term embedding for term normalization.
Yuan, Zheng; Zhao, Zhengyun; Sun, Haixia; Li, Jiao; Wang, Fei; Yu, Sheng.
Affiliation
  • Yuan Z; Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China.
  • Zhao Z; Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China.
  • Sun H; Institute of Medical Information, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China.
  • Li J; Institute of Medical Information, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China.
  • Wang F; Department of Population Health Sciences, Weill Cornell Medical College, Cornell University, NY, USA.
  • Yu S; Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China. Electronic address: syu@tsinghua.edu.cn.
J Biomed Inform; 126: 103983, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34990838
ABSTRACT

OBJECTIVE:

This paper proposes a knowledge-aware medical term embedding, a critical tool for medical term normalization.

METHODS:

We develop CODER (knowledge-infused cross-lingual medical term embedding) via contrastive learning over a medical knowledge graph (KG), the Unified Medical Language System (UMLS); similarities are calculated using both terms and relation triplets from the KG. Training with relations injects medical knowledge into the embeddings and can potentially improve their performance as machine learning features.
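The following is a minimal, hypothetical sketch of the contrastive idea described above: embeddings of synonymous terms (names of the same concept, as one might draw them from UMLS) are pulled together while other in-batch terms act as negatives. The toy character-level encoder, the InfoNCE-style loss, and the example synonym pairs are illustrative stand-ins, not the paper's actual architecture, objective, or data.

```python
# Hypothetical sketch: contrastive training over synonym pairs.
# Not the paper's exact model; a toy encoder keeps the example self-contained.
import torch
import torch.nn.functional as F

class ToyTermEncoder(torch.nn.Module):
    """Bag-of-characters encoder, used only so the example runs without downloads."""
    def __init__(self, vocab_size=256, dim=64):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(vocab_size, dim)

    def forward(self, terms):
        ids = [torch.tensor([ord(c) % 256 for c in t]) for t in terms]
        lengths = torch.tensor([len(x) for x in ids])
        offsets = torch.cat([torch.tensor([0]), lengths.cumsum(0)[:-1]])
        return F.normalize(self.emb(torch.cat(ids), offsets), dim=-1)

def infonce_loss(anchor_vecs, positive_vecs, temperature=0.07):
    # Row i should score highest at column i (its synonym);
    # all other in-batch terms serve as negatives.
    logits = anchor_vecs @ positive_vecs.t() / temperature
    targets = torch.arange(len(anchor_vecs))
    return F.cross_entropy(logits, targets)

# Synonym pairs as they might be drawn from UMLS (invented for illustration).
pairs = [("myocardial infarction", "heart attack"),
         ("hypertension", "high blood pressure"),
         ("cephalalgia", "headache")]

encoder = ToyTermEncoder()
anchors = encoder([a for a, _ in pairs])
positives = encoder([b for _, b in pairs])
loss = infonce_loss(anchors, positives)
loss.backward()  # gradients drive synonyms toward shared embeddings
print(float(loss))
```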

RESULTS:

We evaluate CODER on zero-shot term normalization, semantic similarity, and relation classification benchmarks; the results show that CODER outperforms various state-of-the-art biomedical word embeddings, concept embeddings, and contextual embeddings.
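As a rough illustration of what embedding-based zero-shot term normalization looks like, the sketch below maps a query mention to the concept whose name embedding is nearest by cosine similarity. The `embed` function is a placeholder for any term encoder (such as CODER), and the concept dictionary entries are invented for illustration; this is not the paper's evaluation code.

```python
# Sketch of zero-shot term normalization via nearest-neighbor cosine similarity.
# `embed` is a placeholder encoder; the CUI dictionary below is illustrative only.
import numpy as np

def embed(term: str) -> np.ndarray:
    """Placeholder encoder: deterministic pseudo-random unit vector per term."""
    vec = np.random.default_rng(abs(hash(term)) % (2**32)).standard_normal(64)
    return vec / np.linalg.norm(vec)

concept_names = {
    "C0027051": "myocardial infarction",
    "C0020538": "hypertension",
    "C0018681": "headache",
}
name_matrix = np.stack([embed(name) for name in concept_names.values()])
cuis = list(concept_names.keys())

def normalize(mention: str) -> str:
    scores = name_matrix @ embed(mention)   # cosine similarity (vectors are unit-norm)
    return cuis[int(np.argmax(scores))]     # CUI of the nearest concept name

print(normalize("heart attack"))  # with a real encoder this should map to C0027051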

CONCLUSION:

CODER embeddings effectively capture the semantic similarity and relatedness of medical concepts. One can use CODER for embedding-based medical term normalization or to provide features for machine learning. Like other pretrained language models, CODER can also be fine-tuned for specific tasks. Code and models are available at https://github.com/GanjinZero/CODER.
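A hedged usage sketch follows, showing how one might obtain term embeddings from a released CODER checkpoint with the Hugging Face transformers library. The model identifier is an assumption based on the authors' GitHub account name, and CLS-token pooling is chosen only for simplicity; consult the repository linked above for the actual released checkpoints and the recommended pooling.

```python
# Hedged usage sketch: term embeddings from a CODER checkpoint via transformers.
# MODEL_ID is an assumed name; verify against the repository before use.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "GanjinZero/UMLSBert_ENG"  # assumed checkpoint identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

terms = ["myocardial infarction", "heart attack"]
batch = tokenizer(terms, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    # CLS-token pooling for simplicity; mean pooling is a common alternative.
    hidden = model(**batch).last_hidden_state
    embeddings = torch.nn.functional.normalize(hidden[:, 0, :], dim=-1)

similarity = float(embeddings[0] @ embeddings[1])
print(f"cosine similarity: {similarity:.3f}")  # synonyms should score high
```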

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Natural Language Processing / Unified Medical Language System Language: En Journal: J Biomed Inform Journal subject: MEDICAL INFORMATICS Year of publication: 2022 Document type: Article Country of affiliation: China