Extracting biomedical relation from cross-sentence text using syntactic dependency graph attention network.
J Biomed Inform; 144: 104445, 2023 08.
Article in En | MEDLINE | ID: mdl-37467835
ABSTRACT
In biomedical literature, cross-sentence text often expresses rich knowledge, and extracting interaction relations between entities from such text is of great significance to biomedical research. However, cross-sentence text has a longer sequence length than a single sentence, so research on cross-sentence information extraction should focus more on learning the structural information of context dependencies. Handling the global dependencies and structural information of long sequences effectively remains a challenge, and graph-oriented modeling methods have recently received increasing attention. In this paper, we propose a new graph attention network guided by syntactic dependency relationships (SR-GAT) for extracting biomedical relations from cross-sentence text. It allows each node to attend to other nodes in its neighborhood, regardless of the sequence length. The attention weight between nodes is given by a syntactic relation graph probability network (SR-GPR), which encodes the syntactic dependency between nodes and guides the graph attention mechanism to learn information about the dependency structure. The learned feature representation retains information about node-to-node syntactic dependencies and can further discover global dependencies effectively. Experimental results on a publicly available biomedical dataset demonstrate that our method achieves state-of-the-art performance while requiring significantly fewer computational resources. Specifically, in the "drug-mutation" relation extraction task, our method achieves an accuracy of 93.78% for binary classification and 92.14% for multi-class classification. In the "drug-gene-mutation" relation extraction task, it achieves an accuracy of 93.22% for binary classification and 92.28% for multi-class classification. Across all relation extraction tasks, our method improves accuracy by an average of 0.49% over the best existing model. Furthermore, our method achieves an accuracy of 69.5% in text classification, surpassing most existing models and demonstrating robust generalization across domains without additional fine-tuning.
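The abstract does not give the SR-GAT equations, so the following is only a minimal sketch of the general idea it describes: each node attends to its neighbors in a dependency graph, and the attention score is shifted by a term that depends on the syntactic relation of the connecting edge. Everything named here is an assumption made for illustration: the function `dependency_attention`, the lookup table `relation_bias` (a stand-in for a learned network such as SR-GPR), the dot-product scoring, and the relation labels, including the hypothetical `next_sentence` edge that links two sentences.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dependency_attention(features, edges, relation_bias):
    """One round of neighbor-masked attention over a dependency graph.

    features: list of node feature vectors (all the same length).
    edges: (head, dependent, relation) triples; edges may link sentences.
    relation_bias: hypothetical score per relation label. This table is an
        illustrative assumption standing in for a learned component.
    """
    n = len(features)
    dim = len(features[0])
    # neighbors[i] maps neighbor index -> relation bias; self-loop has bias 0.
    neighbors = [{i: 0.0} for i in range(n)]
    for head, dep, rel in edges:
        bias = relation_bias.get(rel, 0.0)
        neighbors[head][dep] = bias  # treat the graph as undirected
        neighbors[dep][head] = bias

    scale = math.sqrt(dim)
    updated, weights = [], []
    for i in range(n):
        idx = sorted(neighbors[i])
        # Scaled dot-product score plus the relation-dependent bias.
        scores = [dot(features[i], features[j]) / scale + neighbors[i][j]
                  for j in idx]
        alpha = softmax(scores)  # normalized over graph neighbors only
        weights.append(dict(zip(idx, alpha)))
        updated.append([sum(a * features[j][k] for a, j in zip(alpha, idx))
                        for k in range(dim)])
    return updated, weights


# Toy graph: four tokens, the last edge crossing a sentence boundary.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
edges = [(0, 1, "nsubj"), (1, 2, "obj"), (2, 3, "next_sentence")]
relation_bias = {"nsubj": 1.0, "obj": 0.5, "next_sentence": 0.0}

updated, weights = dependency_attention(features, edges, relation_bias)
for i, w in enumerate(weights):
    print(i, {j: round(a, 3) for j, a in w.items()})
```

Because the softmax runs only over a node's graph neighbors, the cost per node depends on its degree rather than on the length of the token sequence, which is one way to read the abstract's remark that attention works "regardless of the sequence length". Stacking several such rounds would let information travel along multi-hop dependency paths.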
Database: MEDLINE
Main subject: Biomedical Research / Language
Language: En
Journal: J Biomed Inform
Journal subject: Medical Informatics
Year: 2023
Type: Article
Affiliation country: China