Joint extraction of Chinese medical entities and relations based on RoBERTa and single-module global pointer.
Li, Dongmei; Yang, Yu; Cui, Jinman; Meng, Xianghao; Qu, Jintao; Jiang, Zhuobin; Zhao, Yufeng.
Affiliation
  • Li D; School of Information Science and Technology, Beijing Forestry University, 100083, Beijing, China.
  • Yang Y; Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, 100083, Beijing, China.
  • Cui J; School of Information Science and Technology, Beijing Forestry University, 100083, Beijing, China.
  • Meng X; Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, 100083, Beijing, China.
  • Qu J; School of Information Science and Technology, Beijing Forestry University, 100083, Beijing, China.
  • Jiang Z; Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, 100083, Beijing, China.
  • Zhao Y; School of Information Science and Technology, Beijing Forestry University, 100083, Beijing, China.
BMC Med Inform Decis Mak ; 24(1): 218, 2024 Jul 31.
Article in En | MEDLINE | ID: mdl-39085892
ABSTRACT

BACKGROUND:

Most Chinese joint entity and relation extraction tasks in medicine involve numerous nested entities, overlapping relations, and other challenging extraction issues. To handle these problems, some traditional methods decompose the joint extraction task into multiple steps or multiple modules, which in turn introduces local dependency.

METHODS:

To alleviate this issue, we propose RSGP, a joint extraction model for Chinese medical entities and relations based on RoBERTa and a single-module global pointer, which formulates joint extraction as a global pointer linking problem. Considering the distinctive structure of the Chinese language, we use the RoBERTa-wwm pre-trained language model at the encoding layer to obtain a better embedding representation. We then represent the input sentence as a third-order tensor and score each position in the tensor in preparation for decoding the triples. Finally, we design a novel single-module global pointer decoding approach to reduce the generation of redundant information. In particular, we handle the decoding of single-character entities separately, which improves the time and space performance of RSGP to some extent.
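The idea of scoring every position of a third-order tensor and then reading links off it can be illustrated with a small sketch. This is not the RSGP decoder, whose tagging scheme the abstract does not specify; it only assumes a tensor indexed as [relation][head position][tail position] and a score threshold of 0, and the function name `decode_links` and the toy scores are hypothetical.

```python
from typing import List, Set, Tuple

# Assumed layout (not taken from the paper): scores[r][i][j] is the score
# for a link of relation r from character position i to character position j.
Tensor3 = List[List[List[float]]]


def decode_links(scores: Tensor3, threshold: float = 0.0) -> Set[Tuple[int, int, int]]:
    """Return every (relation, head_pos, tail_pos) cell scoring above the threshold."""
    links: Set[Tuple[int, int, int]] = set()
    for r, table in enumerate(scores):
        for i, row in enumerate(table):
            for j, score in enumerate(row):
                if score > threshold:
                    links.add((r, i, j))
    return links


# Toy example: 2 hypothetical relations over a 3-character sentence.
toy_scores: Tensor3 = [[[-1.0] * 3 for _ in range(3)] for _ in range(2)]
toy_scores[0][0][2] = 2.5  # relation 0 links position 0 to position 2
toy_scores[1][2][1] = 0.7  # relation 1 links position 2 to position 1

toy_links = decode_links(toy_scores)
```

A real decoder would additionally have to combine such links into entity spans and complete triples; that step depends on the tagging scheme and is left out here.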

RESULTS:

To verify the effectiveness of our model in extracting Chinese medical entities and relations, we carry out experiments on the public dataset CMeIE. Experimental results show that RSGP performs significantly better than the baseline models on the joint extraction of Chinese medical entities and relations, achieving state-of-the-art results.

CONCLUSION:

The proposed RSGP can effectively extract entities and relations from Chinese medical texts and help to structure these texts, thereby providing high-quality data support for the construction of Chinese medical knowledge graphs.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Natural Language Processing Limits: Humans Country/Region as subject: Asia Language: En Journal: BMC Med Inform Decis Mak Journal subject: Medical Informatics Year of publication: 2024 Document type: Article Country of affiliation: China