Overview of the 8th Social Media Mining for Health Applications (#SMM4H) Shared Tasks at the AMIA 2023 Annual Symposium.

Klein, Ari Z; Banda, Juan M; Guo, Yuting; Schmidt, Ana Lucia; Xu, Dongfang; Amaro, Jesus Ivan Flores; Rodriguez-Esteban, Raul; Sarker, Abeed; Gonzalez-Hernandez, Graciela

Affiliations

Klein AZ; Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA.
Banda JM; Department of Computer Science, Georgia State University, Atlanta, GA, USA.
Guo Y; Department of Biomedical Informatics, Emory University, Atlanta, GA, USA.
Schmidt AL; Roche Innovation Center, Basel, Switzerland.
Xu D; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
Amaro JIF; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
Rodriguez-Esteban R; Roche Innovation Center, Basel, Switzerland.
Sarker A; Department of Biomedical Informatics, Emory University, Atlanta, GA, USA.
Gonzalez-Hernandez G; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA.

medRxiv; 2023 Nov 08.

Article in English | MEDLINE | ID: mdl-37986776

ABSTRACT

The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to address the natural language processing and machine learning challenges inherent to utilizing social media data for health informatics. The eighth iteration of the #SMM4H shared tasks was hosted at the AMIA 2023 Annual Symposium and consisted of five tasks that represented various social media platforms (Twitter and Reddit), languages (English and Spanish), methods (binary classification, multi-class classification, extraction, and normalization), and topics (COVID-19, therapies, social anxiety disorder, and adverse drug events). In total, 29 teams registered, representing 18 countries. In this paper, we present the annotated corpora, a technical summary of the systems, and the performance results. In general, the top-performing systems used deep neural network architectures based on pre-trained transformer models. In particular, the top-performing systems for the classification tasks were based on single models that were pre-trained on social media corpora. To facilitate future work, the datasets (a total of 61,353 posts) will remain available by request, and the CodaLab sites will remain active for a post-evaluation phase.

Full text: 1 | Database: MEDLINE | Language: English | Publication year: 2023 | Document type: Article
