A method for constructing word sense embeddings based on word sense induction.
Sun, Yujia; Platos, Jan.
Affiliation
  • Sun Y; Department of Computer Science, Technical University of Ostrava, 17. Listopadu 2172/15, 70800, Ostrava-Poruba, Czech Republic. yujia.sun.st@vsb.cz.
  • Platos J; Institute of Network Information Security, Hebei GEO University, No. 136 East Huai'an Road, Shijiazhuang, 050031, Hebei, China. yujia.sun.st@vsb.cz.
Sci Rep ; 13(1): 12945, 2023 Aug 09.
Article in En | MEDLINE | ID: mdl-37558764
ABSTRACT
Polysemy is an inherent characteristic of natural language. In order to make it easier to distinguish between different senses of polysemous words, we propose a method for encoding multiple different senses of polysemous words using a single vector. The method first uses a two-layer bidirectional long short-term memory neural network and a self-attention mechanism to extract the contextual information of polysemous words. Then, a K-means algorithm, which is improved by optimizing the density peaks clustering algorithm based on cosine similarity, is applied to perform word sense induction on the contextual information of polysemous words. Finally, the method constructs the corresponding word sense embedded representations of the polysemous words. The results of the experiments demonstrate that the proposed method produces better word sense induction than Euclidean distance, Pearson correlation, and KL-divergence and more accurate word sense embeddings than mean shift, DBSCAN, spectral clustering, and agglomerative clustering.
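
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the context vectors have already been produced by some encoder (the BiLSTM and self-attention stage is not shown), it uses a simple density-peaks heuristic under cosine distance to choose the initial centers, and it treats each final centroid as a sense vector. The function names and the `cutoff` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def normalize(X):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.clip(norms, 1e-12, None)


def density_peak_seeds(X, k, cutoff=0.2):
    """Return indices of k seed points chosen by a density-peaks heuristic.

    rho   = number of points within `cutoff` cosine distance (self included)
    delta = cosine distance to the nearest point that ranks higher in density
    Points with a large rho * delta are dense and far from denser points.
    """
    Xn = normalize(X)
    dist = 1.0 - Xn @ Xn.T
    np.fill_diagonal(dist, 0.0)
    rho = (dist < cutoff).sum(axis=1)

    # Rank points by density; the stable sort breaks ties by index so that
    # equally dense neighbours are not all treated as peaks.
    order = np.argsort(-rho, kind="stable")
    delta = np.empty(len(X))
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = dist[i].max()
        else:
            delta[i] = dist[i, order[:rank]].min()
    return np.argsort(-(rho * delta), kind="stable")[:k]


def cosine_kmeans(X, k, cutoff=0.2, n_iter=50):
    """K-means on unit vectors, seeded by density peaks instead of random picks."""
    Xn = normalize(X)
    centers = Xn[density_peak_seeds(X, k, cutoff)]
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Assign each context vector to the most similar center.
        new_labels = np.argmax(Xn @ centers.T, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Recompute each center as the normalized mean of its members.
        for j in range(k):
            members = Xn[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
        centers = normalize(centers)
    return labels, centers


# Toy usage: two groups of synthetic "context vectors" pointing in different directions.
rng = np.random.default_rng(0)
contexts = np.vstack([
    rng.normal([5.0, 0.0, 0.0], 0.1, size=(20, 3)),
    rng.normal([0.0, 5.0, 0.0], 0.1, size=(20, 3)),
])
sense_labels, sense_vectors = cosine_kmeans(contexts, k=2)
```

In this sketch the number of senses `k` is passed in by the caller; how the published method determines it is not stated in the abstract.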

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: Sci Rep Publication year: 2023 Document type: Article
