Exploring the Efficacy of Large Language Models in Summarizing Mental Health Counseling Sessions: Benchmark Study.

Adhikary, Prottay Kumar; Srivastava, Aseem; Kumar, Shivani; Singh, Salam Michael; Manuja, Puneet; Gopinath, Jini K; Krishnan, Vijay; Gupta, Swati Kedia; Deb, Koushik Sinha; Chakraborty, Tanmoy

Affiliations

Adhikary PK; Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India.
Srivastava A; Department of Computer Science & Engineering, Indraprastha Institute of Information Technology Delhi, New Delhi, India.
Kumar S; Department of Computer Science & Engineering, Indraprastha Institute of Information Technology Delhi, New Delhi, India.
Singh SM; Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India.
Manuja P; YourDOST, Karnataka, India.
Gopinath JK; YourDOST, Karnataka, India.
Krishnan V; Department of Psychiatry, All India Institute of Medical Sciences, Rishikesh, India.
Gupta SK; Department of Psychiatry, All India Institute of Medical Sciences, New Delhi, India.
Deb KS; Department of Psychiatry, All India Institute of Medical Sciences, New Delhi, India.
Chakraborty T; Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India.

JMIR Ment Health; 11: e57306, 2024 Jul 23.

Article in English | MEDLINE | ID: mdl-39042893

ABSTRACT

BACKGROUND:

Comprehensive session summaries enable effective continuity in mental health counseling and facilitate informed therapy planning. However, manual summarization is a significant burden that diverts experts' attention from the core counseling process. Automatic summarization can address this issue by giving mental health professionals concise summaries of lengthy therapy sessions, thereby increasing their efficiency. Existing approaches, however, often overlook the nuances of counseling interactions.

OBJECTIVE:

This study evaluates the effectiveness of state-of-the-art large language models (LLMs) in selectively summarizing various components of therapy sessions through aspect-based summarization, aiming to benchmark their performance.

METHODS:

We first created Mental Health Counseling-Component-Guided Dialogue Summaries, a benchmarking data set that consists of 191 counseling sessions with summaries focused on 3 distinct counseling components (also known as counseling aspects). Next, we assessed the capabilities of 11 state-of-the-art LLMs in addressing the task of counseling-component-guided summarization. The generated summaries were evaluated quantitatively using standard summarization metrics and verified qualitatively by mental health professionals.

RESULTS:

Our findings demonstrated the superior performance of task-specific LLMs such as MentalLlama, Mistral, and MentalBART, as measured by standard quantitative metrics such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE)-1, ROUGE-2, ROUGE-L, and Bidirectional Encoder Representations from Transformers Score, across all aspects of the counseling components. Furthermore, expert evaluation revealed that Mistral outperformed both MentalLlama and MentalBART across 6 parameters: affective attitude, burden, ethicality, coherence, opportunity costs, and perceived effectiveness. However, these models share a common weakness: all of them leave room for improvement on the opportunity costs and perceived effectiveness metrics.
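For readers unfamiliar with the ROUGE metrics named in the results, the following is a minimal Python sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 scores are commonly computed. It is an illustration only: the abstract does not state which implementation the authors used, and this version assumes lowercase whitespace tokenization with no stemming. The function names and the example sentences are hypothetical and are not taken from the study's data set.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, candidate_total, reference_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 from clipped n-gram overlap (assumed whitespace tokenization)."""
    cand = _ngrams(candidate.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter '&' keeps the minimum count
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def _lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Sentence-level ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return _f1(_lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Invented example strings, not drawn from the benchmark.
    generated = "the client reports poor sleep"
    gold = "the client reports poor sleep and anxiety"
    print(round(rouge_n(generated, gold, 1), 3))
    print(round(rouge_n(generated, gold, 2), 3))
    print(round(rouge_l(generated, gold), 3))
```

Because these scores measure only lexical overlap with a reference summary, a high value does not by itself indicate clinical usefulness, which is consistent with the study pairing automatic metrics with expert evaluation.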

CONCLUSIONS:

While LLMs fine-tuned specifically on mental health domain data display better performance based on automatic evaluation scores, expert assessments indicate that these models are not yet reliable for clinical application. Further refinement and validation are necessary before their implementation in practice.

Subjects

Benchmarking; Counseling; Humans; Counseling/methods; Adult; Mental Disorders/therapy; Female

Keywords

AI; artificial intelligence; counseling summarization; digital health; large language models; mental health

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Benchmarking / Counseling Limits: Adult / Female / Humans Language: En Journal: JMIR Ment Health Publication year: 2024 Document type: Article Country of affiliation: India Country of publication: Canada
