1.
Cureus; 16(6): e62643, 2024 Jun.
Article in English | MEDLINE | ID: mdl-39036109

ABSTRACT

BACKGROUND: Chat Generative Pre-Trained Transformer (ChatGPT) is an artificial intelligence (AI) chatbot capable of delivering human-like responses to a seemingly infinite number of inquiries. For the technology to perform certain healthcare-related tasks or act as a study aid, it must have up-to-date knowledge and the ability to reason through medical information. The purpose of this study was to assess the orthopedic knowledge and reasoning ability of ChatGPT by querying it with orthopedic board-style questions. METHODOLOGY: We queried ChatGPT (GPT-3.5) with a total of 472 questions from the Orthobullets dataset (n = 239), the 2022 Orthopaedic In-Training Examination (OITE) (n = 124), and the 2021 OITE (n = 109). The importance, difficulty, and category were recorded for questions from the Orthobullets question bank. Responses were assessed for answer choice correctness, whether the explanation given matched that of the dataset, answer integrity, and the reason for incorrectness. RESULTS: ChatGPT correctly answered 55.9% (264/472) of questions and, of those answered correctly, gave an explanation that matched that of the dataset for 92.8% (245/264). The chatbot used information internal to the question in all responses (100%), and used information external to the question (98.3%) and logical reasoning (96.4%) in most responses. There was no significant difference in the proportion of questions answered correctly between the datasets (P = 0.62), nor by question category (P = 0.67), importance (P = 0.95), or difficulty (P = 0.87) within the Orthobullets dataset. Incorrect answers were most often due to information error, i.e., failure to identify the information required to answer the question (81.7% of incorrect responses). CONCLUSIONS: ChatGPT performs below the threshold likely required to pass the American Board of Orthopaedic Surgery (ABOS) Part I written exam. Its performance on the 2022 and 2021 OITEs fell between the average performance of an intern and that of a second-year resident. A major limitation of the current model is its failure to identify the information required to correctly answer the questions.
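The query-and-score workflow described in the Methodology can be illustrated with a short script. This is a minimal sketch, not the authors' actual pipeline: it assumes the OpenAI Python client, a hypothetical question format with an answer key, and the `gpt-3.5-turbo` chat model as the GPT-3.5 endpoint; grading of explanations and reasoning types was done manually in the study.

```python
# Minimal sketch of querying GPT-3.5 with board-style multiple-choice
# questions and scoring answer-choice correctness. Hypothetical data;
# not the authors' actual pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical question format: stem, lettered options, and an answer key.
questions = [
    {
        "stem": "Which nerve is most at risk during a posterior approach to the hip?",
        "options": {"A": "Femoral nerve", "B": "Sciatic nerve",
                    "C": "Obturator nerve", "D": "Superior gluteal nerve"},
        "answer": "B",
    },
]

correct = 0
for q in questions:
    option_text = "\n".join(f"{k}. {v}" for k, v in q["options"].items())
    prompt = (f"{q['stem']}\n{option_text}\n"
              "Answer with a single letter, then explain your reasoning.")
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    reply = response.choices[0].message.content.strip()
    # Score the first character of the reply against the answer key.
    if reply and reply[0].upper() == q["answer"]:
        correct += 1

print(f"Answered {correct}/{len(questions)} correctly "
      f"({100 * correct / len(questions):.1f}%)")
```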

2.
Article in English | MEDLINE | ID: mdl-38912370

ABSTRACT

Background: ChatGPT is an artificial intelligence chatbot capable of providing human-like responses to virtually any inquiry. This advancement has provoked public interest in the use of ChatGPT, including in health care. The purpose of the present study was to investigate the quantity and accuracy of ChatGPT outputs for general patient-focused inquiries regarding 40 orthopaedic conditions. Methods: For each of the 40 conditions, ChatGPT (GPT-3.5) was prompted with the text "I have been diagnosed with [condition]. Can you tell me more about it?" The numbers of treatment options, risk factors, and symptoms given for each condition were compared with the numbers in the corresponding American Academy of Orthopaedic Surgeons (AAOS) OrthoInfo website article to assess information quantity. For accuracy assessment, an attending orthopaedic surgeon rated the outputs as <50%, 50% to 74%, 75% to 99%, or 100% accurate; an orthopaedic sports medicine fellow independently rated output accuracy as well. Results: Compared with the AAOS OrthoInfo website, ChatGPT provided significantly fewer treatment options (mean difference, -2.5; p < 0.001) and risk factors (mean difference, -1.1; p = 0.02) but did not differ in the number of symptoms given (mean difference, -0.5; p = 0.31). The surgical treatment options given by ChatGPT were often nondescript (n = 20 outputs), such as listing "surgery" as the only operative option. Regarding accuracy, the attending surgeon rated most conditions (26 of 40; 65%) as mostly (75% to 99%) accurate and the remainder (14 of 40; 35%) as moderately (50% to 74%) accurate. Neither rater rated any condition as mostly inaccurate (<50% accurate). Interobserver agreement between the accuracy ratings was poor (κ = 0.03; p = 0.30). Conclusions: ChatGPT provides at least moderately accurate outputs for general inquiries about orthopaedic conditions but falls short in the quantity of information it provides on risk factors and treatment options. Professional organizations, such as the AAOS, remain the preferred source of musculoskeletal information compared with ChatGPT. Clinical Relevance: ChatGPT is an emerging technology with potential roles and limitations in patient education that are still being explored.
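The interobserver agreement in the Results is summarized with Cohen's kappa (κ = 0.03, indicating poor agreement despite both raters choosing adjacent categories). As a worked illustration of how such a kappa is computed from two raters' category assignments, here is a sketch using scikit-learn; the ratings below are hypothetical, not the study's data.

```python
# Sketch: Cohen's kappa for two raters assigning accuracy categories.
# Hypothetical ratings for illustration; not the study's data.
from sklearn.metrics import cohen_kappa_score

categories = ["<50%", "50-74%", "75-99%", "100%"]
attending = ["75-99%", "50-74%", "75-99%", "75-99%", "50-74%"]
fellow    = ["50-74%", "75-99%", "75-99%", "50-74%", "100%"]

kappa = cohen_kappa_score(attending, fellow, labels=categories)
print(f"Cohen's kappa = {kappa:.2f}")  # the study reports kappa = 0.03 (poor)
```

Kappa corrects raw percent agreement for agreement expected by chance, which is why two raters who both cluster in the middle categories can still score near zero.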

3.
Arthroscopy; 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38599534

ABSTRACT

PURPOSE: To prospectively compare the short-term clinical outcomes of patients undergoing hip arthroscopy with versus without the use of a perineal post. METHODS: A prospective, single-surgeon cohort study was performed on a subset of patients undergoing hip arthroscopy between 2020 and 2022. A post-free hip distraction system was used at one center at which the senior author operates, and a perineal post was used at another surgical location. An electronic survey of patient-reported outcome measures (PROMs) was completed by each patient at a minimum of 1 year postoperatively. PROMs included a visual analog scale (VAS) for pain; the University of California, Los Angeles (UCLA) Activity Scale; the modified Harris Hip Score (mHHS); the Hip Outcome Score-Sports-Specific Subscale (HOS-SSS); and a Single Assessment Numeric Evaluation. Postoperative scores and clinically significant outcomes, including the minimal clinically important difference, substantial clinical benefit, and patient acceptable symptom state (PASS), for each PROM were compared between groups. RESULTS: Of 87 patients eligible for the study, 69 were reached for follow-up (79%; 41 post, 28 postless). No significant differences were found between groups in sex (post: 61% female, postless: 54% female, P = .54), age (post: 34 years, postless: 29 years, P = .11), body mass index (post: 26, postless: 24, P = .23), or follow-up duration (post: 24.4 months, postless: 21.3 months, P = .16). The post-assisted group had a significantly higher VAS pain score (3.1 vs 1.4, P = .01), a significantly lower UCLA Activity Scale score (7.0 vs 8.4, P = .02), and a significantly lower mHHS (73.7 vs 82.2, P = .03). A significantly higher proportion of patients in the postless group achieved the PASS for the UCLA Activity Scale (89.3% vs 68.3%, P = .04), mHHS (84.6% vs 61.0%, P = .04), and HOS-SSS (84.0% vs 61.0%, P = .048), as well as a substantial clinical benefit for the HOS-SSS (72.0% vs 41.5%, P = .02). One patient (2.6%) in the post group underwent revision hip arthroscopy, and another was indicated for total hip arthroplasty by the time of follow-up. CONCLUSIONS: Postless hip arthroscopy may result in better clinical outcomes compared with post-assisted hip arthroscopy. LEVEL OF EVIDENCE: Level III, retrospective cohort study.
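The PASS comparisons in the Results are comparisons of proportions between two independent groups. The abstract does not name the statistical test used, so the sketch below is purely illustrative: it assumes a two-proportion z-test, with counts back-calculated from the reported percentages for the UCLA Activity Scale PASS (89.3% of 28 postless patients is 25; 68.3% of 41 post patients is 28).

```python
# Sketch: two-proportion z-test on PASS achievement for the UCLA
# Activity Scale. Counts are back-calculated from reported percentages;
# the authors' actual test is not stated in the abstract, so the
# computed p-value may differ slightly from the reported P = .04.
from statsmodels.stats.proportion import proportions_ztest

achieved = [25, 28]   # postless, post patients achieving the PASS
totals   = [28, 41]   # group sizes

z_stat, p_value = proportions_ztest(achieved, totals, alternative="two-sided")
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")  # ~.04, consistent with the abstract
```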
