Semi-supervised distributed representations of documents for sentiment analysis.

Park, Saerom; Lee, Jaewook; Kim, Kyoungok

Affiliation

Park S; Department of Convergence Security Engineering, Sungshin University, 2 Bomunro, 34Da-Gil, Seongbuk-gu, Seoul, 02844, Republic of Korea.
Lee J; Industrial Engineering, Seoul National University, 1 Gwanakro, Gwanak-gu, Seoul, 08826, Republic of Korea.
Kim K; Information Technology Management Programme, International Fusion School, Seoul National University of Science & Technology (SeoulTech), 232 Gongreungno, Nowon-gu, Seoul, 01811, Republic of Korea. Electronic address: drsaerompark@gmail.com.

Neural Netw; 119: 139-150, 2019 Nov.

Article in English | MEDLINE | ID: mdl-31425854

ABSTRACT

Learning document representations is important when applying machine learning algorithms to sentiment analysis. Distributed representation learning models of words and documents, a class of neural language models, have overcome some limitations of vector space models such as the bag-of-words model and have been used successfully in many natural language processing tasks, including sentiment analysis. However, because such models learn embeddings with only a context-based objective, it is hard for the embeddings to reflect the sentiment of texts. In this research, we address this problem by introducing a semi-supervised sentiment-discriminative objective that uses partial sentiment information about documents. Our method not only reflects the partial sentiment information but also preserves the local structures induced by the original distributed representation learning objectives, because it considers sentiment relationships only between neighboring documents. The proposed method was validated on real-world datasets through sentiment visualization and classification tasks. Visualizations of the Amazon review datasets show improved separation of sentiment classes when document representations from our proposed method are compared with those from other methods. Sentiment prediction from our representations also appears to be consistently superior to prediction from other representations on both the Amazon and Yelp datasets. This work can be extended to develop effective document embeddings for other discriminative tasks.
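
The abstract does not state the objective itself, so the following is only an illustrative sketch of the general idea of a sentiment penalty restricted to neighboring documents with known labels. It is not the authors' formulation. The function names, the Euclidean k-nearest-neighbor rule, the margin-based hinge term, and the convention of marking unlabeled documents with -1 are all assumptions made for this example.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest rows (Euclidean) for each row, excluding itself."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)  # a document is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def neighbor_sentiment_penalty(X, labels, k=1, margin=1.0):
    """Hypothetical penalty over labeled neighbor pairs only.

    labels: integer array, with -1 marking an unlabeled document (assumption).
    Same-label neighbors are penalized by squared distance; different-label
    neighbors are penalized when closer than `margin`.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    neighbors = knn_indices(X, k)
    total, pairs = 0.0, 0
    for i in range(len(X)):
        if labels[i] < 0:
            continue  # no sentiment information for document i
        for j in neighbors[i]:
            if labels[j] < 0:
                continue  # skip pairs with an unlabeled neighbor
            dist = np.linalg.norm(X[i] - X[j])
            if labels[i] == labels[j]:
                total += dist ** 2
            else:
                total += max(0.0, margin - dist) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


# Toy embeddings: two tight clusters; the last document is unlabeled.
X_demo = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y_demo = np.array([1, 0, 1, -1])
print(neighbor_sentiment_penalty(X_demo, y_demo, k=1, margin=1.0))
```

In this toy case only the first two documents form labeled neighbor pairs, and because their labels differ while they sit close together, the penalty is positive. In a semi-supervised setting, a term of this kind would be added to the context-based embedding loss rather than used on its own.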

Subjects

Machine Learning; Natural Language Processing; Algorithms; Humans; Language

Keywords

Discriminative learning; Distributed representation; Natural language processing; Neural probabilistic language model; Semi-supervised representation learning; Sentiment analysis

Full text: 1 | Databases: MEDLINE | Main subject: Natural Language Processing / Machine Learning | Study type: Prognostic_studies | Limit: Humans | Language: En | Journal: Neural Netw | Journal subject: NEUROLOGY | Publication year: 2019 | Document type: Article
