A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to selected genetics questions.

McGrath, Scott P; Kozel, Beth A; Gracefo, Sara; Sutherland, Nykole; Danford, Christopher J; Walton, Nephi

Affiliations

McGrath SP; CITRIS Health, University of California Berkeley, Berkeley, CA 94720-1764, United States.
Kozel BA; Laboratory of Vascular and Matrix Genetics, National Heart, Lung, and Blood Institute (NHLBI), Bethesda, MD 20892, United States.
Gracefo S; Intermountain Precision Genomics, Intermountain Healthcare, St George, UT 84790-8723, United States.
Sutherland N; Intermountain Precision Genomics, Intermountain Healthcare, St George, UT 84790-8723, United States.
Danford CJ; Transplant Services, Intermountain Medical Center, Murray, UT 84107, United States.
Walton N; National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-2152, United States.

J Am Med Inform Assoc; 2024 Jun 14.

Article in English | MEDLINE | ID: mdl-38872284

ABSTRACT

OBJECTIVES:

To evaluate the efficacy of ChatGPT 4 (GPT-4) in delivering genetic information about BRCA1, HFE, and MLH1, building on previous findings with ChatGPT 3.5 (GPT-3.5), and to assess the utility, limitations, and ethical implications of using ChatGPT in medical settings.

MATERIALS AND METHODS:

A structured survey was developed to assess GPT-4's clinical value. An expert panel of genetic counselors and clinical geneticists evaluated GPT-4's responses to the selected genetics questions. We also compared the results with those for GPT-3.5, using descriptive statistics, with data analysis performed in Prism 9.

RESULTS:

The findings indicate improved accuracy in GPT-4 over GPT-3.5 (P < .0001). However, notable errors in accuracy remained. The relevance of GPT-4's responses varied but was generally favorable, with a mean in the "somewhat agree" range. There was no difference in performance by disease category. Scores on the 7-question subset of the Bot Usability Scale (BUS-15) showed no statistically significant difference between the groups but trended lower for GPT-4.

DISCUSSION AND CONCLUSION:

The study underscores GPT-4's potential role in genetic education: it shows notable progress but still faces challenges such as outdated information and the need for ongoing refinement. Our results, while promising, emphasize the importance of balancing technological innovation with ethical responsibility in healthcare information delivery.

Full text: 1 Collections: 01-internacional Database: MEDLINE Language: En Publication year: 2024 Document type: Article