ABSTRACT
Social media have democratized content creation and have made it easy for anybody to spread information online. However, stripping traditional media of their gate-keeping role has left the public unprotected against biased, deceptive, and disinformative content, which can now travel online at breaking-news speed and influence major public events. For example, during the COVID-19 pandemic, a new blending of medical and political disinformation has given rise to the first global infodemic. We offer an overview of the emerging and inter-connected research areas of fact-checking, disinformation, "fake news", propaganda, and media bias detection. We explore the general fact-checking pipeline and important elements thereof such as check-worthiness estimation, spotting previously fact-checked claims, stance detection, source reliability estimation, detection of persuasion techniques, and detecting malicious users in social media. We also cover large-scale pre-trained language models, and the challenges and opportunities they offer for generating and for defending against neural fake news. Finally, we discuss the ongoing COVID-19 infodemic. © 2022 ACM.
ABSTRACT
The rise of the Internet and social media has changed not only how we consume information; it has also democratized the process of content creation and dissemination, making it easily available to anybody. Despite the hugely positive impact, this situation has the downside that the public has been left unprotected against biased, deceptive, and disinformative content, which can now travel online at breaking-news speed and allegedly influence major events such as political elections, or disturb the efforts of governments and health officials to fight the ongoing COVID-19 pandemic. The research community has responded to the issue by proposing a number of inter-connected research directions such as fact-checking, disinformation, misinformation, fake news, propaganda, and media bias detection. Below, we cover the mainstream research, and we also pay attention to less popular but emerging research directions, such as propaganda detection, check-worthiness estimation, detecting previously fact-checked claims, and multimodality, which are of interest to human fact-checkers and journalists. We further cover relevant topics such as stance detection, source reliability estimation, detection of persuasion techniques in text and memes, and detecting malicious users in social media. Moreover, we discuss large-scale pre-trained language models, and the challenges and opportunities they offer for generating and for defending against neural fake news. Finally, we explore some recent efforts aiming at flattening the curve of the COVID-19 infodemic. © 2021 ACM.
ABSTRACT
We describe the fourth edition of the CheckThat! Lab, part of the 2021 Conference and Labs of the Evaluation Forum (CLEF). The lab evaluates technology supporting tasks related to factuality, and covers Arabic, Bulgarian, English, Spanish, and Turkish. Task 1 asks to predict which posts in a Twitter stream are worth fact-checking, focusing on COVID-19 and politics (in all five languages). Task 2 asks to determine whether a claim in a tweet can be verified using a set of previously fact-checked claims (in Arabic and English). Task 3 asks to predict the veracity of a news article and its topical domain (in English). The evaluation is based on mean average precision or precision at rank k for the ranking tasks, and macro-F1 for the classification tasks. This was the most popular CLEF-2021 lab in terms of team registrations: 132 teams. Nearly one-third of them participated: 15, 5, and 25 teams submitted official runs for tasks 1, 2, and 3, respectively. © 2021, Springer Nature Switzerland AG.
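The evaluation measures named above can be sketched as follows. This is a minimal illustration, not the lab's official scorer: the function names are hypothetical, rankings are assumed to be given as lists of binary relevance labels in ranked order, and average precision is assumed to be normalized by the number of relevant items that appear in the ranking.

```python
from typing import Hashable, Sequence


def precision_at_k(ranked_labels: Sequence[int], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant (label 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Assumption: divide by k even if the ranking has fewer than k items.
    return sum(ranked_labels[:k]) / k


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Mean of the precision values at each rank holding a relevant item."""
    hits = 0
    total = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    # Assumption: every relevant item appears somewhere in the ranking.
    return total / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """Average precision, averaged over all ranked lists (e.g. one per query)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def macro_f1(gold: Sequence[Hashable], pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    ranking = [1, 0, 1, 0]  # toy relevance labels in ranked order
    print(precision_at_k(ranking, 2))
    print(average_precision(ranking))
    print(macro_f1(["t", "f", "t", "f"], ["t", "t", "t", "f"]))
```

Macro-averaging weights every class equally regardless of its frequency, which is presumably why it suits classification tasks with skewed label distributions.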
ABSTRACT
The rise of social media has democratized content creation and has made it easy for anybody to share and to spread information online. On the positive side, this has given rise to citizen journalism, thus enabling much faster dissemination of information compared to what was possible with newspapers, radio, and TV. On the negative side, stripping traditional media of their gate-keeping role has left the public unprotected against the spread of disinformation, which can now travel at breaking-news speed over the same democratic channel. This situation gave rise to the proliferation of false information, specifically created to affect individual people's beliefs, and ultimately to influence major events such as political elections; it also marked the dawn of the Post-Truth Era, where appeal to emotions has become more important than the truth. More recently, with the emergence of the COVID-19 pandemic, a new blending of medical and political misinformation and disinformation has given rise to the first global infodemic. Limiting the impact of these negative developments has become a major focus for journalists, social media companies, and regulatory authorities. We offer an overview of the emerging and inter-connected research areas of fact-checking, misinformation, disinformation, "fake news", propaganda, and media bias detection, with a focus on text and computational approaches. We explore the general fact-checking pipeline and important elements thereof such as check-worthiness estimation, spotting previously fact-checked claims, stance detection, source reliability estimation, detection of persuasion/propaganda techniques in text and memes, and detecting malicious users in social media. We further cover large-scale pre-trained language models, and the challenges and opportunities they offer for generating and for defending against neural fake news. Finally, we explore some recent efforts towards flattening the curve of the COVID-19 infodemic. © 2021 Owner/Author.
ABSTRACT
We present an overview of Task 1 of the fourth edition of the CheckThat! Lab, part of the 2021 Conference and Labs of the Evaluation Forum (CLEF). The task asks to predict which posts in a Twitter stream are worth fact-checking, focusing on COVID-19 and politics in five languages: Arabic, Bulgarian, English, Spanish, and Turkish. A total of 15 teams participated in this task and most submissions managed to achieve sizable improvements over the baselines using Transformer-based models such as BERT and RoBERTa. Here, we describe the process of data collection and the task setup, including the evaluation measures, and we give a brief overview of the participating systems. We release to the research community all datasets from the lab as well as the evaluation scripts, which should enable further research in check-worthiness estimation for tweets and political debates. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
ABSTRACT
We describe the fourth edition of the CheckThat! Lab, part of the 2021 Conference and Labs of the Evaluation Forum (CLEF). The lab evaluates technology supporting three tasks related to factuality, and it covers Arabic, Bulgarian, English, Spanish, and Turkish. Here, we present Task 2, which asks to detect previously fact-checked claims (in two languages). A total of four teams participated in this task and submitted a total of sixteen runs; most submissions managed to achieve sizable improvements over the baselines using Transformer-based models such as BERT and RoBERTa. In this paper, we describe the process of data collection and the task setup, including the evaluation measures used, and we give a brief overview of the participating systems. Last but not least, we release to the research community all datasets from the lab as well as the evaluation scripts, which should enable further research in detecting previously fact-checked claims. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
ABSTRACT
We describe the fourth edition of the CheckThat! Lab, part of the 2021 Cross-Language Evaluation Forum (CLEF). The lab evaluates technology supporting various tasks related to factuality, and it is offered in Arabic, Bulgarian, English, and Spanish. Task 1 asks to predict which tweets in a Twitter stream are worth fact-checking (focusing on COVID-19). Task 2 asks to determine whether a claim in a tweet can be verified using a set of previously fact-checked claims. Task 3 asks to predict the veracity of a target news article and its topical domain. The evaluation is carried out using mean average precision or precision at rank k for the ranking tasks, and F1 for the classification tasks. © 2021, Springer Nature Switzerland AG.