Results 1 - 4 of 4
1.
J Med Internet Res; 25: e48659, 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37606976

ABSTRACT

BACKGROUND: Large language model (LLM)-based artificial intelligence chatbots direct the power of large training data sets toward successive, related tasks, as opposed to single-ask tasks, for which artificial intelligence already achieves impressive performance. The capacity of LLMs to assist in the full scope of iterative clinical reasoning via successive prompting, in effect acting as artificial physicians, has not yet been evaluated. OBJECTIVE: This study aimed to evaluate ChatGPT's capacity for ongoing clinical decision support via its performance on standardized clinical vignettes. METHODS: We inputted all 36 published clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual into ChatGPT and compared its accuracy on differential diagnoses, diagnostic testing, final diagnosis, and management based on patient age, gender, and case acuity. Accuracy was measured as the proportion of correct responses to the questions posed within the clinical vignettes, as calculated by human scorers. We further conducted linear regression to assess the factors contributing to ChatGPT's performance on clinical tasks. RESULTS: ChatGPT achieved an overall accuracy of 71.7% (95% CI 69.3%-74.1%) across all 36 clinical vignettes. The LLM demonstrated the highest performance in making a final diagnosis, with an accuracy of 76.9% (95% CI 67.8%-86.1%), and the lowest performance in generating an initial differential diagnosis, with an accuracy of 60.3% (95% CI 54.2%-66.6%). Compared to answering questions about general medical knowledge, ChatGPT demonstrated inferior performance on differential diagnosis (β=-15.8%; P<.001) and clinical management (β=-7.4%; P=.02) question types. CONCLUSIONS: ChatGPT achieves impressive accuracy in clinical decision-making, with increasing strength as more clinical information becomes available to it. In particular, ChatGPT demonstrates the greatest accuracy in tasks of final diagnosis as compared to initial diagnosis. Limitations include possible model hallucinations and the unclear composition of ChatGPT's training data set.


Subject(s)
Artificial Intelligence; Humans; Clinical Decision-Making; Organizations; Workflow; User-Centered Design
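The headline figures above are accuracies reported as proportions with 95% confidence intervals. As a rough illustration of how such an interval is obtained (this is not the authors' scoring pipeline, and the counts below are hypothetical), a Wald normal-approximation interval can be computed in a few lines:

```python
import math

def proportion_ci(successes, trials, z=1.96):
    """Wald (normal-approximation) 95% CI for a proportion."""
    p = successes / trials
    se = math.sqrt(p * (1 - p) / trials)
    return p, p - z * se, p + z * se

# Hypothetical counts: 717 correct responses out of 1,000 scored questions.
p, lo, hi = proportion_ci(717, 1000)
```

For small samples or proportions near 0 or 1, a Wilson or exact (Clopper-Pearson) interval is usually preferred over the Wald form sketched here.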
2.
J Arthroplasty; 38(11): 2373-2378, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37207702

ABSTRACT

BACKGROUND: Vitamin E-diffused highly cross-linked polyethylene (VEPE) acetabular liners for total hip arthroplasty (THA) have shown favorable results in small cohort studies. However, larger studies are warranted to compare their performance to highly cross-linked polyethylene (XLPE) and demonstrate clinical significance in 10-year arthroplasty outcomes. This study compared acetabular liner wear and patient-reported outcome measures (PROMs) between patients treated with VEPE and XLPE liners in a prospective, international, multicenter study with minimum 7-year follow-up. METHODS: A total of 977 patients (17 centers; 8 countries) were enrolled from 2007 to 2012. The centers were randomly assigned to implants. At 1-year, 3-year, 5-year, and 7-year postoperative visits, radiographs, PROMs, and incidence of revision were collected. Acetabular liner wear was calculated using computer-assisted vector analysis of serial radiographs. General health, disease progression, and treatment satisfaction reported by patients were scored using 5 validated surveys and compared using Mann-Whitney U tests. At 7 years, 75.4% of eligible patients submitted data. RESULTS: The mean acetabular liner wear rate was -0.009 mm/y and 0.024 mm/y in the VEPE and XLPE groups, respectively (P = .01). There were no statistically significant differences in PROMs. The overall revision incidence was 1.8% (n = 18); the incidence in the VEPE and XLPE cohorts was 1.92% (n = 10) and 1.75% (n = 8), respectively. CONCLUSION: We found that VEPE acetabular liners in total hip arthroplasty led to no significant clinical difference in 7-year outcomes as measured by acetabular liner wear rate, PROMs, and revision rate. While VEPE liners showed less wear, the wear rate for both the VEPE and XLPE liners was below the threshold for osteolysis. Therefore, the difference in liner wear does not appear to translate into a clinical difference at 7 years, as further indicated by the lack of difference in PROMs and the low revision incidence.


Subject(s)
Arthroplasty, Replacement, Hip; Hip Prosthesis; Humans; Arthroplasty, Replacement, Hip/adverse effects; Polyethylene; Vitamin E; Follow-Up Studies; Prospective Studies; Prosthesis Failure; Prosthesis Design
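The study above compared PROM scores between groups with Mann-Whitney U tests, a rank-based test that makes no normality assumption. As a minimal pure-Python sketch of the statistic (illustrative only; a real analysis would use a statistics package, which also supplies the exact or normal-approximation p-value):

```python
import math

def mann_whitney_u(x, y):
    """U statistic for sample x: count of pairs (xi, yj) with xi > yj,
    counting ties as 1/2. Brute-force O(n1*n2) for clarity."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

def mann_whitney_z(x, y):
    """Normal-approximation z-score for U (reasonable for larger samples,
    ignoring the tie correction)."""
    n1, n2 = len(x), len(y)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (mann_whitney_u(x, y) - mu) / sigma
```

If x stochastically dominates y, U approaches n1*n2; identical distributions give U near n1*n2/2 and z near 0.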
3.
medRxiv; 2023 Feb 26.
Article in English | MEDLINE | ID: mdl-36865204

ABSTRACT

IMPORTANCE: Large language model (LLM) artificial intelligence (AI) chatbots direct the power of large training datasets towards successive, related tasks, as opposed to single-ask tasks, for which AI already achieves impressive performance. The capacity of LLMs to assist in the full scope of iterative clinical reasoning via successive prompting, in effect acting as virtual physicians, has not yet been evaluated. OBJECTIVE: To evaluate ChatGPT's capacity for ongoing clinical decision support via its performance on standardized clinical vignettes. DESIGN: We inputted all 36 published clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual into ChatGPT and compared accuracy on differential diagnoses, diagnostic testing, final diagnosis, and management based on patient age, gender, and case acuity. SETTING: ChatGPT, a publicly available LLM. PARTICIPANTS: Clinical vignettes featured hypothetical patients with a variety of ages and gender identities, and a range of Emergency Severity Indices (ESIs) based on initial clinical presentation. EXPOSURES: MSD Clinical Manual vignettes. MAIN OUTCOMES AND MEASURES: We measured the proportion of correct responses to the questions posed within the clinical vignettes tested. RESULTS: ChatGPT achieved 71.7% (95% CI, 69.3% to 74.1%) accuracy overall across all 36 clinical vignettes. The LLM demonstrated the highest performance in making a final diagnosis, with an accuracy of 76.9% (95% CI, 67.8% to 86.1%), and the lowest performance in generating an initial differential diagnosis, with an accuracy of 60.3% (95% CI, 54.2% to 66.6%). Compared to answering questions about general medical knowledge, ChatGPT demonstrated inferior performance on differential diagnosis (β=-15.8%, p<0.001) and clinical management (β=-7.4%, p=0.02) question types. CONCLUSIONS AND RELEVANCE: ChatGPT achieves impressive accuracy in clinical decision making, with particular strengths emerging as more clinical information is at its disposal.

4.
EFORT Open Rev; 5(11): 793-798, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33312706

ABSTRACT

Over 100,000 total knee replacements (TKRs) are carried out in the UK annually, with cemented fixation accounting for approximately 95% of all primary TKRs. In Australia, 68.1% of all primary TKRs use cemented fixation, and only 10.9% use cementless fixation. However, there has been a renewed interest in cementless fixation as a result of improvements in implant design and manufacturing technology. This meta-analysis aimed to compare the outcomes of cemented and cementless fixation in primary TKR. Outcome measures included the revision rate and patient-reported functional scores. MEDLINE and EMBASE were searched from the earliest available date to November 2018 for randomized controlled trials comparing cemented versus cementless fixation outcomes in primary TKR. Six studies met our inclusion criteria and were analysed. A total of 755 knees were included: 356 underwent cemented fixation and 399 underwent cementless fixation. They were followed up for an average of 8.4 years (range: 2.0 to 16.6). This study found no significant difference in revision rates or knee function between cemented and cementless TKR at up to 16.6 years of follow-up. Cite this article: EFORT Open Rev 2020;5:793-798. DOI: 10.1302/2058-5241.5.200030.
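A meta-analysis like the one above combines per-trial event counts into a single pooled effect. The abstract does not state which pooling model was used, so the following is only a generic sketch of one common fixed-effect choice, the Mantel-Haenszel pooled risk ratio, with hypothetical study counts:

```python
def mantel_haenszel_rr(studies):
    """Fixed-effect Mantel-Haenszel pooled risk ratio.

    studies: list of (events_a, n_a, events_b, n_b) tuples, one per trial,
    where group a is e.g. cemented and group b cementless fixation.
    """
    num = sum(a * nb / (na + nb) for a, na, b, nb in studies)
    den = sum(b * na / (na + nb) for a, na, b, nb in studies)
    return num / den

# Hypothetical trials: (revisions_cemented, knees_cemented,
#                       revisions_cementless, knees_cementless)
rr = mantel_haenszel_rr([(3, 120, 4, 130), (2, 80, 2, 90)])
```

A pooled RR near 1 with a confidence interval crossing 1 would correspond to the "no significant difference in revision rates" conclusion reported above.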
