Search | Biblioteca Virtual em Saúde

SPBERE: Boosting span-based pipeline biomedical entity and relation extraction via entity information.

Yang, Chenglin; Deng, Jiamei; Chen, Xianlai; An, Ying.

J Biomed Inform; 145: 104456, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37482171

ABSTRACT

Triplet extraction is one of the fundamental tasks in biomedical text mining. Compared with traditional pipeline approaches, joint methods can alleviate the propagation of errors from entity recognition to relation classification. However, existing methods struggle to detect overlapping entities and overlapping relations, which are ubiquitous in biomedical texts. In this work, we propose a novel pipeline method for end-to-end biomedical triplet extraction. In particular, a span-based detection strategy is used to detect overlapping triplets by enumerating possible candidate spans and entity pairs. The strategy is further used to capture different contextualized representations via an entity model and a relation model, respectively. Furthermore, to strengthen the interrelation between spans, entity information from the output of the entity model is used to construct the input for the relation model, without any external knowledge. Our approach is evaluated on the drug-drug interaction (DDI) and chemical-protein interaction (CHEMPROT) datasets, where it improves the absolute F1-score in relation extraction by 3.5%-3.7% compared with prior work. The experimental results highlight the importance of three factors: overlapping triplet detection using the span-based approach, acquisition of various contextualized representations via different in-domain pre-trained language models, and early fusion of entity information in the relation model.
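The span-based detection strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `enumerate_spans` and `enumerate_pairs`, the `max_width` cap, and the example sentence are all assumptions made for illustration. The point is only that enumerating every span up to a maximum width yields overlapping candidates, and that every ordered pair of detected entities becomes a relation candidate.

```python
from itertools import permutations
from typing import List, Tuple

# A span is (start, end) over token indices: start inclusive, end exclusive.
Span = Tuple[int, int]


def enumerate_spans(tokens: List[str], max_width: int) -> List[Span]:
    """List every contiguous span of at most `max_width` tokens (hypothetical helper)."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def enumerate_pairs(entities: List[Span]) -> List[Tuple[Span, Span]]:
    """List every ordered (head, tail) pair of distinct entity spans (hypothetical helper)."""
    return list(permutations(entities, 2))


# Illustrative sentence, not taken from DDI or CHEMPROT.
tokens = ["aspirin", "may", "enhance", "warfarin", "effects"]
candidate_spans = enumerate_spans(tokens, max_width=2)

# Suppose an entity model kept these two spans; both orders are relation candidates.
kept_entities = [(0, 1), (3, 4)]
candidate_pairs = enumerate_pairs(kept_entities)
```

Because spans such as `(0, 2)` and `(1, 3)` are both candidates, overlapping entities can be classified independently, which a token-level tagging scheme with one label per token cannot do.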

Subjects

Data Mining, Language, Data Mining/methods, Natural Language Processing, Proteins, Drug Interactions

Span-based model for overlapping entity recognition and multi-relations classification in the food domain.

Zhang, Mengqi; Ma, Lei; Ren, Yanzhao; Zhang, Ganggang; Liu, Xinliang.

Math Biosci Eng; 19(5): 5134-5152, 2022 Mar 18.

Article in English | MEDLINE | ID: mdl-35430857

ABSTRACT

Information extraction (IE) is an important part of the entire knowledge graph lifecycle. In the food domain, extracting information such as ingredients and cooking methods from Chinese recipes is crucial to safety risk analysis and ingredient identification. Chinese IE is much more challenging than English IE because of the language's complex structure, the richness of information in word combinations, and the lack of tense. This difficulty is particularly prominent in the food domain, which has high-density knowledge and imprecise syntactic structure. However, existing IE methods focus only on the features of entities in a sentence, such as context and position, and ignore the features of the entity itself and the influence of its attributes on the prediction of inter-entity relationships. To solve the problems of overlapping entity recognition and multi-relation classification in the food domain, we propose a span-based model for IE known as SpIE. SpIE uses a span representation for each possible candidate entity to capture span-level features, which transforms named entity recognition (NER) into a classification task. In addition, SpIE feeds extra information about the entity into the relation classification (RC) model by considering the effect of the entity's attributes (both the entity mention and the entity type) on the relationship between entity pairs. We apply SpIE to two datasets and observe that it significantly outperforms previous neural approaches because it captures the features of overlapping entities and entity attributes, and that it remains very competitive in general IE.
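The idea of feeding entity attributes into the relation classifier can be sketched in Python as follows. The abstract does not say how SpIE encodes the mention and type, so the layout used here (sentence tokens, then a `[SEP]` separator, then each entity's mention tokens and type label) is an assumption, as are the `Entity` class, the `relation_input` function, the type labels, and the English example tokens.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    start: int  # inclusive token index
    end: int    # exclusive token index
    type: str   # predicted entity type (hypothetical label set)


def relation_input(tokens: List[str], head: Entity, tail: Entity) -> List[str]:
    """Append each entity's mention and type to the sentence (assumed layout)."""
    head_mention = tokens[head.start:head.end]
    tail_mention = tokens[tail.start:tail.end]
    return (
        tokens
        + ["[SEP]"] + head_mention + [head.type]
        + ["[SEP]"] + tail_mention + [tail.type]
    )


# Illustrative English tokens; the paper itself works on Chinese recipes.
tokens = ["stir-fry", "the", "pork", "belly"]
head = Entity(start=2, end=4, type="INGREDIENT")
tail = Entity(start=0, end=1, type="METHOD")
rc_tokens = relation_input(tokens, head, tail)
```

With this layout the relation classifier sees both what each entity is (its mention) and what kind of thing it is (its type), in addition to the sentence context.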

Subjects

Information Storage and Retrieval, Language, Research Design, Risk Assessment
