Pesquisa | Portal de Pesquisa da BVS Enfermagem

1.

Adherence of a Large Language Model to Clinical Guidelines for Craniofacial Plastic and Reconstructive Surgeries.

Kwon, Daniel Y; Wang, Anya; Mejia, Mateo Restrepo; Saturno, Michael P; Oleru, Olachi; Seyidova, Nargiz; Taub, Peter J.

Ann Plast Surg ; 92(3): 261-262, 2024 Mar 01.

Artigo em Inglês | MEDLINE | ID: mdl-38319985

Assuntos

Procedimentos de Cirurgia Plástica , Cirurgia Plástica , Humanos

2.

ChatGPT and its Role in the Decision-Making for the Diagnosis and Treatment of Lumbar Spinal Stenosis: A Comparative Analysis and Narrative Review.

Rajjoub, Rami; Arroyave, Juan Sebastian; Zaidat, Bashar; Ahmed, Wasil; Mejia, Mateo Restrepo; Tang, Justin; Kim, Jun S; Cho, Samuel K.

Global Spine J ; 14(3): 998-1017, 2024 Apr.

Artigo em Inglês | MEDLINE | ID: mdl-37560946

RESUMO

STUDY DESIGN: Comparative Analysis and Narrative Review. OBJECTIVE: To assess and compare ChatGPT's responses to the clinical questions and recommendations proposed by The 2011 North American Spine Society (NASS) Clinical Guideline for the Diagnosis and Treatment of Degenerative Lumbar Spinal Stenosis (LSS). We explore the advantages and disadvantages of ChatGPT's responses through an updated literature review on spinal stenosis. METHODS: We prompted ChatGPT with questions from the NASS Evidence-based Clinical Guidelines for LSS and compared its generated responses with the recommendations provided by the guidelines. A review of the literature was performed via PubMed, OVID, and Cochrane on the diagnosis and treatment of lumbar spinal stenosis between January 2012 and April 2023. RESULTS: 14 questions proposed by the NASS guidelines for LSS were uploaded into ChatGPT and directly compared to the responses offered by NASS. Three questions were on the definition and history of LSS, one on diagnostic tests, seven on non-surgical interventions and three on surgical interventions. The review process found 40 articles that were selected for inclusion that helped corroborate or contradict the responses that were generated by ChatGPT. CONCLUSIONS: ChatGPT's responses were similar to findings in the current literature on LSS. These results demonstrate the potential for implementing ChatGPT into the spine surgeon's workplace as a means of supporting the decision-making process for LSS diagnosis and treatment. However, our narrative summary only provides a limited literature review and additional research is needed to standardize our findings as means of validating ChatGPT's use in the clinical space.

3.

Use of ChatGPT for Determining Clinical and Surgical Treatment of Lumbar Disc Herniation With Radiculopathy: A North American Spine Society Guideline Comparison.

Mejia, Mateo Restrepo; Arroyave, Juan Sebastian; Saturno, Michael; Ndjonko, Laura Chelsea Mazudie; Zaidat, Bashar; Rajjoub, Rami; Ahmed, Wasil; Zapolsky, Ivan; Cho, Samuel K.

Neurospine ; 21(1): 149-158, 2024 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-38291746

RESUMO

OBJECTIVE: Large language models like chat generative pre-trained transformer (ChatGPT) have found success in various sectors, but their application in the medical field remains limited. This study aimed to assess the feasibility of using ChatGPT to provide accurate medical information to patients, specifically evaluating how well ChatGPT versions 3.5 and 4 aligned with the 2012 North American Spine Society (NASS) guidelines for lumbar disk herniation with radiculopathy. METHODS: ChatGPT's responses to questions based on the NASS guidelines were analyzed for accuracy. Three new categories-overconclusiveness, supplementary information, and incompleteness-were introduced to deepen the analysis. Overconclusiveness referred to recommendations not mentioned in the NASS guidelines, supplementary information denoted additional relevant details, and incompleteness indicated omitted crucial information from the NASS guidelines. RESULTS: Out of 29 clinical guidelines evaluated, ChatGPT-3.5 demonstrated accuracy in 15 responses (52%), while ChatGPT-4 achieved accuracy in 17 responses (59%). ChatGPT-3.5 was overconclusive in 14 responses (48%), while ChatGPT-4 exhibited overconclusiveness in 13 responses (45%). Additionally, ChatGPT-3.5 provided supplementary information in 24 responses (83%), and ChatGPT-4 provided supplemental information in 27 responses (93%). In terms of incompleteness, ChatGPT-3.5 displayed this in 11 responses (38%), while ChatGPT-4 showed incompleteness in 8 responses (23%). CONCLUSION: ChatGPT shows promise for clinical decision-making, but both patients and healthcare providers should exercise caution to ensure safety and quality of care. While these results are encouraging, further research is necessary to validate the use of large language models in clinical settings.

4.

Performance of a Large Language Model in the Generation of Clinical Guidelines for Antibiotic Prophylaxis in Spine Surgery.

Zaidat, Bashar; Shrestha, Nancy; Rosenberg, Ashley M; Ahmed, Wasil; Rajjoub, Rami; Hoang, Timothy; Mejia, Mateo Restrepo; Duey, Akiro H; Tang, Justin E; Kim, Jun S; Cho, Samuel K.

Neurospine ; 21(1): 128-146, 2024 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-38569639

RESUMO

OBJECTIVE: Large language models, such as chat generative pre-trained transformer (ChatGPT), have great potential for streamlining medical processes and assisting physicians in clinical decision-making. This study aimed to assess the potential of ChatGPT's 2 models (GPT-3.5 and GPT-4.0) to support clinical decision-making by comparing its responses for antibiotic prophylaxis in spine surgery to accepted clinical guidelines. METHODS: ChatGPT models were prompted with questions from the North American Spine Society (NASS) Evidence-based Clinical Guidelines for Multidisciplinary Spine Care for Antibiotic Prophylaxis in Spine Surgery (2013). Its responses were then compared and assessed for accuracy. RESULTS: Of the 16 NASS guideline questions concerning antibiotic prophylaxis, 10 responses (62.5%) were accurate in ChatGPT's GPT-3.5 model and 13 (81%) were accurate in GPT-4.0. Twenty-five percent of GPT-3.5 answers were deemed as overly confident while 62.5% of GPT-4.0 answers directly used the NASS guideline as evidence for its response. CONCLUSION: ChatGPT demonstrated an impressive ability to accurately answer clinical questions. GPT-3.5 model's performance was limited by its tendency to give overly confident responses and its inability to identify the most significant elements in its responses. GPT-4.0 model's responses had higher accuracy and cited the NASS guideline as direct evidence many times. While GPT-4.0 is still far from perfect, it has shown an exceptional ability to extract the most relevant research available compared to GPT-3.5. Thus, while ChatGPT has shown far-reaching potential, scrutiny should still be exercised regarding its clinical use at this time.

5.

Can Large Language Models (LLMs) Predict the Appropriate Treatment of Acute Hip Fractures in Older Adults? Comparing Appropriate Use Criteria With Recommendations From ChatGPT.

Nietsch, Katrina S; Shrestha, Nancy; Mazudie Ndjonko, Laura C; Ahmed, Wasil; Mejia, Mateo Restrepo; Zaidat, Bashar; Ren, Renee; Duey, Akiro H; Li, Samuel Q; Kim, Jun S; Hidden, Krystin A; Cho, Samuel K.

J Am Acad Orthop Surg Glob Res Rev ; 8(8)2024 Aug 01.

Artigo em Inglês | MEDLINE | ID: mdl-39137403

RESUMO

BACKGROUND: Acute hip fractures are a public health problem affecting primarily older adults. Chat Generative Pretrained Transformer may be useful in providing appropriate clinical recommendations for beneficial treatment. OBJECTIVE: To evaluate the accuracy of Chat Generative Pretrained Transformer (ChatGPT)-4.0 by comparing its appropriateness scores for acute hip fractures with the American Academy of Orthopaedic Surgeons (AAOS) Appropriate Use Criteria given 30 patient scenarios. "Appropriateness" indicates the unexpected health benefits of treatment exceed the expected negative consequences by a wide margin. METHODS: Using the AAOS Appropriate Use Criteria as the benchmark, numerical scores from 1 to 9 assessed appropriateness. For each patient scenario, ChatGPT-4.0 was asked to assign an appropriate score for six treatments to manage acute hip fractures. RESULTS: Thirty patient scenarios were evaluated for 180 paired scores. Comparing ChatGPT-4.0 with AAOS scores, there was a positive correlation for multiple cannulated screw fixation, total hip arthroplasty, hemiarthroplasty, and long cephalomedullary nails. Statistically significant differences were observed only between scores for long cephalomedullary nails. CONCLUSION: ChatGPT-4.0 scores were not concordant with AAOS scores, overestimating the appropriateness of total hip arthroplasty, hemiarthroplasty, and long cephalomedullary nails, and underestimating the other three. ChatGPT-4.0 was inadequate in selecting an appropriate treatment deemed acceptable, most reasonable, and most likely to improve patient outcomes.

Assuntos

Fraturas do Quadril , Humanos , Fraturas do Quadril/cirurgia , Idoso , Feminino , Masculino , Idoso de 80 Anos ou mais , Artroplastia de Quadril , Hemiartroplastia , Guias de Prática Clínica como Assunto , Doença Aguda , Idioma

6.

Generative artificial intelligence fails to provide sufficiently accurate recommendations when compared to established breast reconstruction surgery guidelines.

Saturno, Michael P; Mejia, Mateo Restrepo; Wang, Anya; Kwon, Daniel; Oleru, Olachi; Seyidova, Nargiz; Henderson, Peter W.

J Plast Reconstr Aesthet Surg ; 86: 248-250, 2023 11.

Artigo em Inglês | MEDLINE | ID: mdl-37793197

Assuntos

Inteligência Artificial , Mamoplastia , Humanos , Guias de Prática Clínica como Assunto

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA