Towards understanding the role of content-based and contextualized features in detecting abuse on Twitter.

Hussain, Kamal; Saeed, Zafar; Abbasi, Rabeeh; Sindhu, Muddassar; Khattak, Akmal; Arafat, Sachi; Daud, Ali; Mushtaq, Mubashar

Affiliation

Hussain K; Instituto Superior Técnico, Universidade de Lisboa, Portugal.
Saeed Z; Dipartimento di Informatica, Università degli Studi di Bari, Bari, Italy.
Abbasi R; Department of Computer Science, Quaid-i-Azam University, Islamabad, Pakistan.
Sindhu M; Department of Computer Science, Quaid-i-Azam University, Islamabad, Pakistan.
Khattak A; Department of Computer Science, Quaid-i-Azam University, Islamabad, Pakistan.
Arafat S; Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia.
Daud A; Faculty of Resilience, Rabdan Academy, Abu Dhabi, United Arab Emirates.
Mushtaq M; Department of Computer Science, Forman Christian College, Lahore, Pakistan.

Heliyon ; 10(8): e29593, 2024 Apr 30.

Article in English | MEDLINE | ID: mdl-38665572

ABSTRACT

This paper presents a novel approach for detecting abuse on Twitter. Abusive posts have become a major problem for social media platforms such as Twitter, and identifying abuse is important for mitigating its potential harm. Many methods have been proposed to detect abuse on Twitter. However, most existing approaches look only at the content of the abusive tweet in isolation and do not consider its contextual information, particularly the tweets posted before it. In this paper, we propose a new method for detecting abuse that uses contextual information from the tweets that precede and follow the abusive tweet. We hypothesize that this contextual information helps to better understand the intent of the abusive tweet and to identify abuse that content-based methods would otherwise miss. We performed extensive experiments to identify the best combination of features and machine learning algorithms for detecting abuse on Twitter, testing eight machine learning classifiers on content-based and context-based features. The best results are obtained by combining the content-based and context-based features. The proposed method reaches a highest accuracy of 86%, whereas the existing methods used for comparison reach at most 79.2%, an absolute improvement of around 7 percentage points.
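
The abstract does not specify which features the authors extract, so the following is only a minimal sketch of the general idea of combining content-based and context-based features. It assumes a plain bag-of-words representation, a whitespace tokenizer, and a hypothetical helper `build_features` with a hypothetical `window` parameter; none of these come from the paper. Tokens from the target tweet are counted under a `content:` prefix, and tokens from the tweets that precede and follow it are counted under a `context:` prefix, so that a downstream classifier could weight the two sources separately.

```python
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; a deliberately simple stand-in
    # for whatever preprocessing a real system would use.
    return text.lower().split()


def build_features(timeline, index, window=1):
    """Count tokens of the target tweet and of its neighbouring tweets.

    Hypothetical helper: keys are prefixed with "content:" or "context:"
    so the two feature groups stay distinguishable after merging.
    """
    features = Counter()

    # Content-based features: tokens of the target tweet itself.
    for token in tokenize(timeline[index]):
        features["content:" + token] += 1

    # Context-based features: tokens of up to `window` tweets before
    # and after the target tweet, clipped at the timeline boundaries.
    start = max(0, index - window)
    end = min(len(timeline), index + window + 1)
    for pos in range(start, end):
        if pos == index:
            continue
        for token in tokenize(timeline[pos]):
            features["context:" + token] += 1

    return features


# Invented example timeline; the middle tweet is the one being classified.
timeline = [
    "you lost the match again",
    "nobody wants you here",
    "leave the group now",
]
features = build_features(timeline, index=1, window=1)
```

The resulting dictionary could be vectorized and passed to any standard classifier; setting `window=0` reduces it to a content-only baseline, which is one simple way to compare the two feature settings.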

Keywords

Abuse; Context; Machine learning; Social media; Twitter

Full text: 1 | Collections: 01-internacional | Database: MEDLINE | Language: En | Journal: Heliyon | Year of publication: 2024 | Document type: Article