Assessment of ChatGPT-3.5's Knowledge in Oncology: Comparative Study with ASCO-SEP Benchmarks.
Odabashian, Roupen; Bastin, Donald; Jones, Georden; Manzoor, Maria; Tangestaniapour, Sina; Assad, Malke; Lakhani, Sunita; Odabashian, Maritsa; McGee, Sharon.
Affiliations
  • Odabashian R; Department of Oncology, Barbara Ann Karmanos Cancer Institute, Wayne State University, Detroit, MI, United States.
  • Bastin D; Department of Medicine, Division of Internal Medicine, The Ottawa Hospital and the University of Ottawa, Ottawa, ON, Canada.
  • Jones G; Mary A Rackham Institute, University of Michigan, Ann Arbor, MI, United States.
  • Assad M; Department of Plastic Surgery, University of Pittsburgh Medical Center, Pittsburgh, PA, United States.
  • Lakhani S; Department of Medicine, Division of Internal Medicine, Jefferson Abington Hospital, Philadelphia, PA, United States.
  • Odabashian M; Mary A Rackham Institute, University of Michigan, Ann Arbor, MI, United States.
  • McGee S; The Ottawa Hospital Research Institute, Ottawa, ON, Canada.
JMIR AI. 2024 Jan 12;3:e50442.
Article in English | MEDLINE | ID: mdl-38875575
ABSTRACT

BACKGROUND:

ChatGPT (OpenAI) is a state-of-the-art large language model that uses artificial intelligence (AI) to answer questions across diverse topics. The American Society of Clinical Oncology Self-Evaluation Program (ASCO-SEP) is a comprehensive educational program that helps physicians keep pace with the many rapid advances in the field. Its question bank consists of multiple-choice questions addressing the many facets of cancer care, including diagnosis, treatment, and supportive care. As applications of ChatGPT rapidly expand, it becomes vital to ascertain whether the knowledge of ChatGPT-3.5 matches the established standards that oncologists are recommended to follow.

OBJECTIVE:

This study aims to evaluate whether ChatGPT-3.5's knowledge aligns with the established benchmarks that oncologists are expected to adhere to. Doing so will provide a deeper understanding of the potential of this tool to support clinical decision-making.

METHODS:

We conducted a systematic assessment of the performance of ChatGPT-3.5 on the ASCO-SEP, the leading educational and assessment tool for medical oncologists in training and practice. Over 1000 multiple-choice questions covering the spectrum of cancer care were extracted. Questions were categorized by cancer type or discipline, with subcategorization as treatment, diagnosis, or other. Answers were scored as correct if ChatGPT-3.5 selected the answer defined as correct by ASCO-SEP.
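
For illustration only, since the abstract does not describe the authors' tooling, a scoring step like this could be automated in Python with the official openai package; the model name, prompt wording, and question format below are assumptions, not the study's actual method:

    from openai import OpenAI  # official OpenAI Python client

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask_question(stem, options):
        # options: dict mapping letters ("A"-"E") to answer text.
        prompt = stem + "\n" + "\n".join(f"{k}. {v}" for k, v in options.items())
        prompt += "\nReply with the single letter of the best answer."
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # stand-in for ChatGPT-3.5
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content.strip()[0].upper()

    def accuracy(questions):
        # questions: list of dicts with "stem", "options", and the ASCO-SEP "key".
        correct = sum(ask_question(q["stem"], q["options"]) == q["key"] for q in questions)
        return correct / len(questions)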

RESULTS:

Overall, ChatGPT-3.5 answered 56.1% (583/1040) of questions correctly. Accuracy varied across cancer types and disciplines: it was highest for questions on developmental therapeutics (8/10; 80% correct) and lowest for questions on gastrointestinal cancer (102/209; 48.8% correct). There was no significant difference in the program's performance across the predefined subcategories of diagnosis, treatment, and other (P=.16).
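
The subcategory comparison corresponds to a test of independence on a correct-versus-incorrect contingency table, which could be run as sketched below with scipy; the per-row counts are hypothetical placeholders, since the abstract reports only the overall totals (583 correct of 1040):

    from scipy.stats import chi2_contingency

    # Rows: diagnosis, treatment, other; columns: correct, incorrect.
    # Per-row counts are hypothetical; only the totals (583/1040) are reported.
    table = [
        [120, 95],
        [350, 290],
        [113, 72],
    ]
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2={chi2:.2f}, dof={dof}, P={p:.2f}")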

CONCLUSIONS:

This study evaluated ChatGPT-3.5's oncology knowledge using the ASCO-SEP, aiming to address uncertainties about the role of AI tools such as ChatGPT in clinical decision-making. Our findings suggest that while ChatGPT-3.5 offers a promising outlook for AI in oncology, its present performance on the ASCO-SEP falls short of the requisite competency level and requires further refinement. Future assessments could explore ChatGPT's clinical decision support capabilities in real-world clinical scenarios, its ease of integration into medical workflows, and its potential to foster interdisciplinary collaboration and patient engagement in health care settings.
Full text: 1 | Collections: 01-international | Database: MEDLINE | Language: En | Year of publication: 2024 | Document type: Article