Using Large Language Models to Understand Suicidality in a Social Media-Based Taxonomy of Mental Health Disorders: Linguistic Analysis of Reddit Posts.

Bauer, Brian; Norel, Raquel; Leow, Alex; Rached, Zad Abi; Wen, Bo; Cecchi, Guillermo

Affiliation

Bauer B; Department of Psychology, University of Georgia, Athens, GA, United States.
Norel R; Digital Health, IBM Research, New York, NY, United States.
Leow A; Department of Psychiatry, University of Illinois Chicago, Chicago, IL, United States.
Rached ZA; Department of Biomedical Engineering and Computer Science, University of Illinois Chicago, Chicago, IL, United States.
Wen B; College Louise Wegmann, Beirut, Lebanon.
Cecchi G; Digital Health, IBM Research, New York, NY, United States.

JMIR Ment Health ; 11: e57234, 2024 May 16.

Article in English | MEDLINE | ID: mdl-38771256

ABSTRACT

Background:

Rates of suicide have increased by over 35% since 1999. Despite concerted efforts, our ability to predict, explain, or treat suicide risk has not significantly improved over the past 50 years.

Objective:

The aim of this study was to use large language models to understand natural language use during public web-based discussions (on Reddit) around topics related to suicidality.

Methods:

We used large language model-based sentence embedding to extract the latent linguistic dimensions of user postings derived from several mental health-related subreddits, with a focus on suicidality. We then applied dimensionality reduction to these sentence embeddings, allowing them to be summarized and visualized in a lower-dimensional Euclidean space for further downstream analyses. We analyzed 2.9 million posts extracted from 30 subreddits, including r/SuicideWatch, between October 1 and December 31, 2022, and the same period in 2010.
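
The embed-then-reduce pipeline described above can be illustrated with a minimal sketch. The abstract names neither the embedding model nor the dimensionality reduction technique, so everything here is an assumption made for illustration: `toy_embed` is a hashed bag-of-words stand-in for a large language model sentence embedding, the reduction is plain principal component analysis computed by power iteration, and the example posts are invented placeholder strings rather than study data.

```python
import math
import zlib


def toy_embed(text, dim=16):
    """Hypothetical stand-in for an LLM sentence embedding:
    hashed bag-of-words counts, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract the per-dimension mean from every row."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def top_component(rows, iters=200):
    """Leading principal direction of centred rows via power iteration."""
    dim = len(rows[0])
    v = [1.0 + j for j in range(dim)]  # fixed, non-uniform starting vector
    scale = math.sqrt(dot(v, v))
    v = [x / scale for x in v]
    for _ in range(iters):
        # Apply X^T X to v without building the covariance matrix.
        scores = [dot(row, v) for row in rows]
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no variance left to explain
            break
        v = [x / norm for x in w]
    return v


def project(posts, k=2):
    """Embed each post, then project onto the first k principal components."""
    X = center([toy_embed(p) for p in posts])
    residual, components = X, []
    for _ in range(k):
        v = top_component(residual)
        components.append(v)
        # Deflate: remove the direction just found before the next pass.
        residual = [[x - dot(row, v) * vi for x, vi in zip(row, v)]
                    for row in residual]
    return [[dot(row, v) for v in components] for row in X]


# Invented placeholder posts, not data from the study.
posts = [
    "feeling disconnected from friends lately",
    "looking for support with anxiety",
    "slept well and feel better today",
    "need advice about stress at work",
]
coords = project(posts)
for post, xy in zip(posts, coords):
    print([round(c, 3) for c in xy], post)
```

Each post ends up as a point in a 2-dimensional Euclidean space, which is the form in which clusters of posts or subreddits could then be visualized or compared. A real analysis would replace `toy_embed` with an actual sentence-embedding model and would likely rely on a numerical library rather than hand-written power iteration.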

Results:

Our results showed that, in line with existing theories of suicide, posters in the suicidality community (r/SuicideWatch) predominantly wrote about feelings of disconnection, burdensomeness, hopelessness, desperation, resignation, and trauma. Further, we identified distinct latent linguistic dimensions (well-being, seeking support, and severity of distress) among all mental health subreddits. Many of the resulting subreddit clusters were in line with a statistically driven diagnostic classification system, namely the Hierarchical Taxonomy of Psychopathology (HiTOP), in that they mapped onto its proposed superspectra.

Conclusions:

Overall, our findings provide data-driven support for several language-based theories of suicide, as well as dimensional classification systems for mental health disorders. Ultimately, this novel combination of natural language processing techniques can assist researchers in gaining deeper insights into emotions and experiences shared on the web and may aid in the validation and refutation of different mental health theories.

Subject(s)

Linguistics; Mental Disorders; Social Media; Suicide; Humans; Social Media/statistics & numerical data; Suicide/psychology; Mental Disorders/psychology; Mental Disorders/epidemiology; Mental Disorders/classification; Natural Language Processing

Keywords

AI; LLM; anxiety; artificial intelligence; depression; downstream analyses; explainable AI; explainable artificial intelligence; large language model; mental health; mental health disorder; mental health disorders; natural language processing; online; online discussions; social media; stress; suicide; trauma; web-based discussions

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Suicide / Social Media / Linguistics / Mental Disorders Limit: Humans Language: English Journal: JMIR Ment Health Year: 2024 Document type: Article Country of affiliation: United States