Addressing cyberbullying in Urdu tweets: a comprehensive dataset and detection system.

Adeeba, Farah; Yousuf, Muhammad Irfan; Anwer, Izza; Tariq, Sardar Umair; Ashfaq, Abdullah; Naqeeb, Malik

Affiliation

Adeeba F; Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan.
Yousuf MI; Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan.
Anwer I; Department of Transportation Engineering and Management, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan.
Tariq SU; Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan.
Ashfaq A; Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan.
Naqeeb M; Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan.

PeerJ Comput Sci ; 10: e1963, 2024.

Article in English | MEDLINE | ID: mdl-38699209

ABSTRACT

Cyberbullying has reached an alarming prevalence: approximately 54% of teenagers experience some form of it, including offensive hate speech, threats, and racism. This research introduces a comprehensive dataset and system for cyberbullying detection in Urdu tweets, drawing on a range of machine learning approaches from traditional models to advanced deep learning techniques. The study has three objectives. First, a dataset of 12,500 annotated Urdu tweets is created and made publicly available to the research community. Second, annotation guidelines for Urdu text, with appropriate labels for cyberbullying detection, are developed. Finally, a series of experiments is conducted to assess the performance of machine learning and deep learning techniques in detecting cyberbullying. The results indicate that fastText deep learning models outperform the other models in cyberbullying detection. The study demonstrates that the proposed system detects and classifies cyberbullying incidents in Urdu tweets effectively, contributing to the broader effort of creating a safer digital environment.

Keywords

Cyberbullying annotation guidelines; Urdu cyberbullying detection; Urdu sentiment analysis; Urdu tweets dataset

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Year of publication: 2024 Document type: Article
