ABSTRACT
To tackle the COVID-19 infodemic, we analysed 58,625 articles from 460 unverified sources, covering the period from 1 January 2020 to 31 December 2022. Unverified sources are those that fact checkers and other mis/disinformation experts have identified as frequently spreading mis/disinformation. Our aim was to identify the main narratives of COVID-19 mis/disinformation, develop a codebook, automate narrative classification by training a classifier, and analyse the spread of narratives over time and across countries. Articles were retrieved with a customised version of the Europe Media Monitor (EMM) processing chain, which provides a stream of text items. Non-English text was machine-translated into English, and similar articles were grouped by clustering. A multi-level codebook of COVID-19 mis/disinformation narratives was developed following an inductive approach, and a transformer-based model was trained to classify all text items according to the codebook. With this model, we identified 12 supernarratives that evolved over the three years studied. The analysis shows that mis/disinformation trends are often rooted in real events, which unverified sources misrepresent or take out of context. We established a process that allows near real-time monitoring of COVID-19 mis/disinformation. This experience will be useful for analysing mis/disinformation on other topics, such as climate change, migration, and geopolitical developments.