A platform for connecting social media data to domain-specific topics using large language models: an application to student mental health.
Ruocco, Leonard; Zhuang, Yuqian; Ng, Raymond; Munthali, Richard J; Hudec, Kristen L; Wang, Angel Y; Vereschagin, Melissa; Vigo, Daniel V.
Affiliations
  • Ruocco L; Data Science Institute, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada.
  • Zhuang Y; Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 2A1, Canada.
  • Ng R; Data Science Institute, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada.
  • Munthali RJ; Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 2A1, Canada.
  • Hudec KL; Data Science Institute, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada.
  • Wang AY; Department of Computer Science, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada.
  • Vereschagin M; Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 2A1, Canada.
  • Vigo DV; Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 2A1, Canada.
JAMIA Open; 7(1): ooae001, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38250583
ABSTRACT

Objectives:

To design a novel artificial intelligence-based software platform that allows users to analyze text data by identifying various coherent topics and parts of the data related to a specific research theme-of-interest (TOI).

Materials and Methods:

Our platform uses state-of-the-art unsupervised natural language processing methods, building on top of a large language model, to analyze social media text data. At the center of the platform's functionality is BERTopic, which clusters social media posts, forming collections of words representing distinct topics. A key feature of our platform is its ability to identify whole sentences corresponding to topic words, vastly improving the platform's ability to perform downstream similarity operations with respect to a user-defined TOI.
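The similarity step described above can be illustrated with a minimal sketch. It assumes cosine similarity as the comparison measure and uses plain word counts as a toy stand-in for the language-model sentence embeddings; neither choice is confirmed by the abstract. The function names (`embed`, `cosine`, `rank_by_toi`) and the example sentences are hypothetical, and the topic clustering done by BERTopic is not reproduced here.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_toi(sentences, toi):
    """Return (score, sentence) pairs sorted from most to least similar to the TOI."""
    toi_vec = embed(toi)
    scored = [(cosine(embed(s), toi_vec), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical representative sentences, one per topic (not taken from the paper).
topic_sentences = [
    "I feel depressed and hopeless about my exams",
    "Where can I find cheap housing near campus",
    "The cafeteria food is surprisingly good this term",
]
toi = "feeling depressed and hopeless"

ranked = rank_by_toi(topic_sentences, toi)
for score, sentence in ranked:
    print(f"{score:.2f}  {sentence}")
```

Comparing whole sentences rather than isolated topic words gives the similarity measure more context to work with, which is the motivation the abstract gives for mapping topic words back to sentences. In practice the `embed` function would be replaced by a real sentence-embedding model.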

Results:

Two case studies on mental health among university students are performed to demonstrate the utility of the platform, focusing on signals within social media (Reddit) data related to depression and their connection to various emergent themes within the data.

Discussion and Conclusion:

Our platform provides researchers with a readily available and inexpensive tool to parse large quantities of unstructured, noisy data into coherent themes and to identify portions of the data related to the research TOI. While the development process for the platform was focused on mental health themes, we believe it is generalizable to other domains of research.

Full text: 1 Database: MEDLINE Language: En Publication year: 2024 Document type: Article
