Evaluation of ChatGPT as a Counselling Tool for Italian-Speaking MASLD Patients: Assessment of Accuracy, Completeness and Comprehensibility.
Pugliese, Nicola; Polverini, Davide; Lombardi, Rosa; Pennisi, Grazia; Ravaioli, Federico; Armandi, Angelo; Buzzetti, Elena; Dalbeni, Andrea; Liguori, Antonio; Mantovani, Alessandro; Villani, Rosanna; Gardini, Ivan; Hassan, Cesare; Valenti, Luca; Miele, Luca; Petta, Salvatore; Sebastiani, Giada; Aghemo, Alessio.
Affiliation
  • Pugliese N; Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, 20072 Milan, Italy.
  • Polverini D; Division of Internal Medicine and Hepatology, Department of Gastroenterology, IRCCS Humanitas Research Hospital, Rozzano, 20089 Milan, Italy.
  • Lombardi R; Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, 20072 Milan, Italy.
  • Pennisi G; Division of Internal Medicine and Hepatology, Department of Gastroenterology, IRCCS Humanitas Research Hospital, Rozzano, 20089 Milan, Italy.
  • Ravaioli F; Unit of Internal Medicine and Metabolic Disease, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico of Milan, 20122 Milan, Italy.
  • Armandi A; Department of Pathophysiology and Transplantation, Università degli Studi di Milano, 20122 Milan, Italy.
  • Buzzetti E; Section of Gastroenterology and Hepatology, PROMISE, University of Palermo, 90127 Palermo, Italy.
  • Dalbeni A; Department of Medical and Surgical Sciences (DIMEC), University of Bologna, 40138 Bologna, Italy.
  • Liguori A; Division of Internal Medicine, Hepatobiliary and Immunoallergic Diseases, IRCCS Azienda Ospedaliero Universitaria di Bologna, 40138 Bologna, Italy.
  • Mantovani A; Division of Gastroenterology and Hepatology, Department of Medical Sciences, University of Turin, Corso Dogliotti 14, 10126 Turin, Italy.
  • Villani R; Metabolic Liver Disease Research Program, I. Department of Internal Medicine, University Medical Center of Mainz, 55131 Mainz, Germany.
  • Gardini I; Internal Medicine and Centre for Hemochromatosis and Hereditary Liver Diseases, ERN-EuroBloodNet Center for Iron Disorders, Azienda Ospedaliero-Universitaria di Modena-Policlinico, 41125 Modena, Italy.
  • Hassan C; Department of Medical and Surgical Sciences, Università degli Studi di Modena e Reggio Emilia, 41125 Modena, Italy.
  • Valenti L; Division of General Medicine C, Department of Medicine, University and Azienda Ospedaliera Universitaria Integrata of Verona, University of Verona, 37134 Verona, Italy.
  • Miele L; Liver Unit, Department of Medicine, University and Azienda Ospedaliera Universitaria Integrata of Verona, University of Verona, 37134 Verona, Italy.
  • Petta S; DiSMeC-Department of Scienze Mediche e Chirurgiche, Fondazione Policlinico Gemelli IRCCS, 00168 Rome, Italy.
  • Sebastiani G; Section of Endocrinology, Diabetes and Metabolism, Department of Medicine, University and Azienda Ospedaliera Universitaria Integrata of Verona, Piazzale Stefani, 37126 Verona, Italy.
  • Aghemo A; C.U.R.E. (University Center for Liver Disease Research and Treatment), Liver Unit, Department of Medical and Surgical Sciences, University of Foggia, 71122 Foggia, Italy.
  • NAFLD Expert Chatbot Working Group; EpaC Onlus, Italian Liver Patient Association, 10141 Turin, Italy.
J Pers Med; 14(6), 2024 May 26.
Article in English | MEDLINE | ID: mdl-38929789
ABSTRACT

BACKGROUND:

Artificial intelligence (AI)-based chatbots have shown promise in providing counseling to patients with metabolic dysfunction-associated steatotic liver disease (MASLD). While ChatGPT3.5 has demonstrated the ability to comprehensively answer MASLD-related questions in English, its accuracy remains suboptimal. Whether language influences these results is unclear. This study aims to assess ChatGPT's performance as a counseling tool for Italian MASLD patients.

METHODS:

Thirteen Italian experts rated the accuracy, completeness and comprehensibility of ChatGPT3.5's answers to 15 MASLD-related questions posed in Italian, using a six-point Likert scale for accuracy and three-point Likert scales for completeness and comprehensibility.

RESULTS:

Mean scores for accuracy, completeness and comprehensibility were 4.57 ± 0.42, 2.14 ± 0.31 and 2.91 ± 0.07, respectively. The physical activity domain achieved the highest mean scores for accuracy and completeness, whereas the specialist referral domain achieved the lowest. Overall, Fleiss's coefficient of concordance for accuracy, completeness and comprehensibility across all 15 questions was 0.016, 0.075 and -0.010, respectively. Age and academic role of the evaluators did not influence the scores. The results were not significantly different from our previous study focusing on English.
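
For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how a Fleiss-type agreement coefficient can be computed from a table of ratings. It assumes the coefficient in question is the standard Fleiss' kappa, which the abstract does not state explicitly. The function name `fleiss_kappa` and the toy ratings are hypothetical and are not the study's data.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings    -- list of lists; ratings[i] holds the scores all raters gave item i
    categories -- every score a rater could assign (e.g. 1..6 for a six-point scale)
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # How many raters chose each category, per item.
    counts = [Counter(row) for row in ratings]

    # Observed agreement per item, then averaged over items.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Agreement expected by chance, from the overall share of each category.
    p_cat = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_e = sum(p * p for p in p_cat)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when every rating is identical")

    return (p_bar - p_e) / (1 - p_e)

# Toy data (hypothetical): 3 questions, 3 raters, three-point scale.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
kappa_toy = fleiss_kappa(toy, categories=[1, 2, 3])
```

Under this definition, values near 1 indicate strong agreement among raters and values near 0 indicate agreement no better than chance, which is the range in which the coefficients reported above fall.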

CONCLUSION:

Language does not appear to affect ChatGPT's ability to provide comprehensible and complete counseling to MASLD patients, but accuracy remains suboptimal in certain domains.

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: J Pers Med Publication year: 2024 Document type: Article Affiliation country: Italy
