[Efficacy and safety of artificial intelligence-based large language models for decision making support in herniology: evaluation by experts and general surgeons]. / Effektivnost' i bezopasnost' bol'shikh yazykovykh modelei na osnove iskusstvennogo intellekta v kachestve instrumenta podderzhki prinyatiya reshenii v gerniologii: otsenka ekspertami i obshchimi khirurgami.
Khirurgiia (Mosk); (8): 6-14, 2024.
Article
in Russian
| MEDLINE
| ID: mdl-39140937
ABSTRACT
OBJECTIVE:
To evaluate the quality of recommendations provided by ChatGPT regarding inguinal hernia repair.
MATERIAL AND METHODS:
ChatGPT was asked 5 questions about the surgical management of inguinal hernias. The chatbot was assigned the role of an expert in herniology and instructed to search only specialized medical databases and to provide references and supporting evidence. Herniology experts and general surgeons (non-experts) rated the quality of the recommendations generated by ChatGPT on a 4-point scale (from 0 to 3 points). Statistical correlations were explored between participants' ratings and their stance on artificial intelligence.
RESULTS:
Experts scored the quality of ChatGPT responses lower than non-experts did (2 (1-2) vs. 2 (2-3), p<0.001). The chatbot failed to provide valid references and actual evidence, and it fabricated half of its references. Respondents were optimistic about the future of neural networks for clinical decision-making support. Most of them were against restricting the use of neural networks in healthcare.
CONCLUSION:
We would not recommend non-specialized large language models as a sole or primary source of information for clinical decision-making or as a virtual search assistant.
Database:
MEDLINE
Main subject:
Artificial Intelligence
/
Herniorrhaphy
Limits:
Humans
Language:
Russian
Journal:
Khirurgiia (Mosk)
Publication year:
2024
Document type:
Article
Affiliation country:
Russian Federation
Publication country:
Russian Federation