Search | Portal Regional da BVS

Decoding the NCCN Guidelines With AI: A Comparative Evaluation of ChatGPT-4.0 and Llama 2 in the Management of Thyroid Carcinoma.

Pandya, Shivam; Bresler, Tamir E; Wilson, Tyler; Htway, Zin; Fujita, Manabu.

Am Surg ; : 31348241269430, 2024 Aug 13.

Article in English | MEDLINE | ID: mdl-39136578

ABSTRACT

INTRODUCTION: Artificial intelligence (AI) has emerged as a promising tool in the delivery of health care. ChatGPT-4.0 (OpenAI, San Francisco, CA) and Llama 2 (Meta, Menlo Park, CA) have each gained attention for their use in various medical applications. OBJECTIVE: This study aims to evaluate and compare the effectiveness of ChatGPT-4.0 and Llama 2 in assisting with complex clinical decision making in the diagnosis and treatment of thyroid carcinoma. PARTICIPANTS: We reviewed the National Comprehensive Cancer Network® (NCCN) Clinical Practice Guidelines for the management of thyroid carcinoma and formulated up to 3 complex clinical questions for each decision-making page. ChatGPT-4.0 and Llama 2 were queried in a reproducible manner. The answers were scored on a Likert scale: 5) Correct; 4) Correct, with missing information requiring clarification; 3) Correct, but unable to complete answer; 2) Partially incorrect; 1) Absolutely incorrect. Score frequencies were compared, and subgroup analysis was conducted on Correctness (defined as scores 1-2 vs 3-5) and Accuracy (scores 1-3 vs 4-5). RESULTS: In total, 58 pages of the NCCN Guidelines® were analyzed, generating 167 unique questions. There was no statistically significant difference between ChatGPT-4.0 and Llama 2 in terms of overall score (Mann-Whitney U-test; Mean Rank = 160.53 vs 174.47, P = 0.123), Correctness (P = 0.177), or Accuracy (P = 0.891). CONCLUSION: ChatGPT-4.0 and Llama 2 demonstrate a limited but substantial capacity to assist with complex clinical decision making relating to the management of thyroid carcinoma, with no significant difference in their effectiveness.
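The subgroup definitions in this abstract (Correctness as scores 1-2 vs 3-5, Accuracy as scores 1-3 vs 4-5) amount to splitting the same Likert scores at two different cutoffs. A minimal sketch of that split follows; the helper name `dichotomize` and the example scores are hypothetical and are not data or code from the study.

```python
from collections import Counter


def dichotomize(scores, cutoff):
    """Split Likert scores into (count below cutoff, count at or above cutoff)."""
    low = sum(1 for s in scores if s < cutoff)
    return low, len(scores) - low


# Hypothetical Likert scores on the 1-5 scale, NOT results from the study.
example_scores = [5, 4, 4, 3, 2, 5, 1, 4, 3, 5]

# Score frequencies, as compared between the two models in the abstract.
frequencies = Counter(example_scores)

# Correctness: scores 1-2 vs scores 3-5.
correctness = dichotomize(example_scores, cutoff=3)

# Accuracy: scores 1-3 vs scores 4-5.
accuracy = dichotomize(example_scores, cutoff=4)

print("frequencies:", dict(sorted(frequencies.items())))
print("correctness (1-2 vs 3-5):", correctness)
print("accuracy (1-3 vs 4-5):", accuracy)
```

Under this reading, a score of 3 counts as correct but not as accurate, which is why the two subgroup analyses can give different P values for the same answers.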

From Bytes to Best Practices: Tracing ChatGPT-3.5's Evolution and Alignment With the National Comprehensive Cancer Network® Guidelines in Pancreatic Adenocarcinoma Management.

Bresler, Tamir E; Pandya, Shivam; Meyer, Ryan; Htway, Zin; Fujita, Manabu.

Am Surg ; 90(10): 2543-2547, 2024 Oct.

Article in English | MEDLINE | ID: mdl-38666297

ABSTRACT

INTRODUCTION: Artificial intelligence continues to play an increasingly important role in modern health care. ChatGPT-3.5 (OpenAI, San Francisco, CA) has gained attention for its potential impact in this domain. OBJECTIVE: To explore the role of ChatGPT-3.5 in guiding clinical decision-making specifically in the context of pancreatic adenocarcinoma and to assess its growth over a period of time. PARTICIPANTS: We reviewed the National Comprehensive Cancer Network® (NCCN) Clinical Practice Guidelines for the Management of Pancreatic Adenocarcinoma and formulated a complex clinical question for each decision-making page. ChatGPT-3.5 was queried in a reproducible fashion. We scored answers on the following Likert scale: 5) Correct; 4) Correct, with missing information requiring clarification; 3) Correct, but unable to complete answer; 2) Partially incorrect; 1) Absolutely incorrect. We repeated this protocol at 3 months. Score frequencies were compared, and subgroup analysis was conducted on Correctness (defined as scores 1-2 vs 3-5) and Accuracy (scores 1-3 vs 4-5). RESULTS: In total, 50 pages of the NCCN Guidelines® were analyzed, generating 50 complex clinical questions. On subgroup analysis, the percentage of Acceptable answers improved from 60% to 76%. The score improvement was statistically significant (Mann-Whitney U-test; Mean Rank = 44.52 vs 56.48, P = .027). CONCLUSION: ChatGPT-3.5 represents an interesting but limited tool for assistance in clinical decision-making. We demonstrate that the platform evolved, and its responses to our standardized questions improved over a relatively short period (3 months). Future research is needed to determine the validity of this tool for this clinical application.
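The "Mean Rank" values quoted with the Mann-Whitney U-test come from ranking both rounds of scores together and averaging the ranks within each round. The sketch below shows that calculation with tie-averaged ranks; the function name `mean_ranks` and the two score lists are hypothetical, and this is an illustration of the standard rank computation rather than the authors' analysis code.

```python
def mean_ranks(group_a, group_b):
    """Return the mean tie-averaged rank of each group in the pooled sample."""
    pooled = sorted(group_a + group_b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        # Advance j past every value tied with pooled[i].
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy 1-based ranks i+1..j and share their average.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    mean_a = sum(rank_of[x] for x in group_a) / len(group_a)
    mean_b = sum(rank_of[x] for x in group_b) / len(group_b)
    return mean_a, mean_b


# Hypothetical Likert scores for two query rounds, NOT data from the study.
first_round = [3, 2, 4, 5, 1]
second_round = [4, 5, 5, 3, 4]

mean_first, mean_second = mean_ranks(first_round, second_round)
print("mean ranks:", mean_first, mean_second)
```

When both groups have the same size, the two mean ranks sum to N + 1, where N is the pooled sample size. The reported values are consistent with that: 44.52 + 56.48 = 101 for 50 answers per round.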

Subjects

Adenocarcinoma, Pancreatic Neoplasms, Practice Guidelines as Topic, Humans, Pancreatic Neoplasms/therapy, Adenocarcinoma/therapy, Clinical Decision-Making, Artificial Intelligence
