1.
J Med Syst; 48(1): 41, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38632172

ABSTRACT

Polypharmacy remains an important challenge for patients with extensive medical complexity. Given the primary care shortage and the increasing aging population, effective polypharmacy management is crucial to manage the increasing burden of care. The capacity of large language model (LLM)-based artificial intelligence to aid in polypharmacy management has yet to be evaluated. Here, we evaluate ChatGPT's performance in polypharmacy management via its deprescribing decisions in standardized clinical vignettes. We inputted several clinical vignettes, originally from a study of general practitioners' deprescribing decisions, into ChatGPT 3.5, a publicly available LLM, and evaluated its capacity for yes/no binary deprescribing decisions as well as list-based prompts in which the model was asked to choose which of several medications to deprescribe. We recorded ChatGPT's responses to yes/no binary deprescribing prompts and the number and types of medications deprescribed. In yes/no binary deprescribing decisions, ChatGPT universally recommended deprescribing medications regardless of activities of daily living (ADL) status in patients with no underlying cardiovascular disease (CVD) history; in patients with a CVD history, ChatGPT's answers varied by technical replicate. The total number of medications deprescribed ranged from 2.67 to 3.67 (out of 7) and did not vary with CVD status, but increased linearly with severity of ADL impairment. Among medication types, ChatGPT preferentially deprescribed pain medications. ChatGPT's deprescribing decisions vary along the axes of ADL status, CVD history, and medication type, indicating some concordance of internal logic between general practitioners and the model. These results indicate that specifically trained LLMs may provide useful clinical support to primary care physicians in polypharmacy management.
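For illustration only, the sketch below shows how such yes/no deprescribing prompts could be run programmatically with technical replicates. It is not the study's code: the vignette text, prompt wording, model name ("gpt-3.5-turbo" standing in for the ChatGPT 3.5 web interface), and replicate count are all assumptions, and the OpenAI Python client is used only as one plausible interface.

from collections import Counter
from openai import OpenAI  # hypothetical choice of client; reads OPENAI_API_KEY from the environment

client = OpenAI()

# Placeholder vignette; the study used published general-practitioner vignettes.
VIGNETTE = (
    "An 82-year-old patient, dependent in most activities of daily living, "
    "with no cardiovascular disease history, takes 7 chronic medications: ..."
)

PROMPT = (
    "You are assisting a general practitioner. Based on the vignette below, "
    "should any medication be deprescribed? Answer 'yes' or 'no' only.\n\n" + VIGNETTE
)

def binary_deprescribing_decision(n_replicates: int = 3) -> Counter:
    """Collect yes/no deprescribing answers over several technical replicates."""
    answers = Counter()
    for _ in range(n_replicates):
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # stand-in for the ChatGPT 3.5 interface used in the study
            messages=[{"role": "user", "content": PROMPT}],
        )
        text = response.choices[0].message.content.strip().lower()
        answers["yes" if text.startswith("yes") else "no"] += 1
    return answers

if __name__ == "__main__":
    print(binary_deprescribing_decision())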


Subjects
Cardiovascular Diseases, Deprescriptions, General Practitioners, Humans, Aged, Polypharmacy, Artificial Intelligence
2.
J Med Internet Res; 25: e48659, 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37606976

ABSTRACT

BACKGROUND: Large language model (LLM)-based artificial intelligence chatbots direct the power of large training data sets toward successive, related tasks as opposed to single-ask tasks, for which artificial intelligence already achieves impressive performance. The capacity of LLMs to assist in the full scope of iterative clinical reasoning via successive prompting, in effect acting as artificial physicians, has not yet been evaluated. OBJECTIVE: This study aimed to evaluate ChatGPT's capacity for ongoing clinical decision support via its performance on standardized clinical vignettes. METHODS: We inputted all 36 published clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual into ChatGPT and compared its accuracy on differential diagnoses, diagnostic testing, final diagnosis, and management based on patient age, gender, and case acuity. Accuracy was measured by the proportion of correct responses to the questions posed within the clinical vignettes tested, as calculated by human scorers. We further conducted linear regression to assess the contributing factors toward ChatGPT's performance on clinical tasks. RESULTS: ChatGPT achieved an overall accuracy of 71.7% (95% CI 69.3%-74.1%) across all 36 clinical vignettes. The LLM demonstrated the highest performance in making a final diagnosis, with an accuracy of 76.9% (95% CI 67.8%-86.1%), and the lowest performance in generating an initial differential diagnosis, with an accuracy of 60.3% (95% CI 54.2%-66.6%). Compared to answering questions about general medical knowledge, ChatGPT demonstrated inferior performance on differential diagnosis (β=-15.8%; P<.001) and clinical management (β=-7.4%; P=.02) question types. CONCLUSIONS: ChatGPT achieves impressive accuracy in clinical decision-making, with increasing strength as it gains more clinical information at its disposal. In particular, ChatGPT demonstrates the greatest accuracy in tasks of final diagnosis as compared to initial diagnosis. Limitations include possible model hallucinations and the unclear composition of ChatGPT's training data set.
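As a minimal sketch of the regression analysis described above (not the authors' code), the snippet below regresses per-question scores on question type with general medical knowledge as the reference category, the setup under which coefficients such as β = -15.8% for differential diagnosis would arise. The data frame, column names, and values are hypothetical.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-question scores (fraction of available points earned).
scores = pd.DataFrame({
    "score": [1.0, 0.5, 1.0, 0.0, 1.0, 0.5, 1.0, 0.0],
    "question_type": ["general", "differential", "management", "differential",
                      "general", "testing", "management", "testing"],
})

# Treat general medical knowledge questions as the reference category, so each
# coefficient is the change in accuracy for that question type relative to it.
model = smf.ols(
    "score ~ C(question_type, Treatment(reference='general'))",
    data=scores,
).fit()
print(model.params)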


Subjects
Artificial Intelligence, Humans, Clinical Decision-Making, Organizations, Workflow, User-Centered Design
3.
Article in English | MEDLINE | ID: mdl-38702066

ABSTRACT

BACKGROUND AND PURPOSE: Imaging stewardship in the emergency department (ED) is vital in ensuring patients receive optimized care. While suspected cord compression (CC) is a frequent indication for total spine MR imaging in the ED, the incidence of CC is low. Recently, our level 1 trauma center introduced a survey spine MR imaging protocol to evaluate for suspected CC while reducing examination time to avoid imaging overutilization. This study aims to evaluate the time savings of the survey, the frequency and patterns of its ordering, and the symptoms and outcomes of patients undergoing it. MATERIALS AND METHODS: This retrospective study examined patients who received a survey spine MR imaging in the ED at our institution between 2018 and 2022. All examinations were performed on a 1.5T GE Healthcare scanner using our institutional CC survey protocol, which includes sagittal T2WI and STIR sequences through the cervical, thoracic, and lumbar spine. Examinations were read by a blinded, board-certified neuroradiologist. RESULTS: A total of 2002 patients received a survey spine MR imaging protocol during the study period. Of these patients, 845 (42.2%; mean age, 57 ± 19 years; 45% women) received survey spine MR imaging examinations for suspicion of CC, and 120 patients (14.2% positivity rate) had radiographic CC. The survey spine MR imaging averaged 5 minutes and 50 seconds (79% faster than routine MR imaging). On multivariate analysis, trauma, back pain, lower extremity weakness, urinary or bowel incontinence, numbness, ataxia, and hyperreflexia were each independently associated with CC. Of the 120 patients with CC, 71 underwent emergent surgery, 20 underwent nonemergent surgery, and 29 were managed medically. CONCLUSIONS: The survey spine protocol was positive for CC in 14% of patients in our cohort and was acquired 79% faster than routine total spine MR imaging. Understanding the positivity rate of CC, the clinical symptoms most associated with it, and the subsequent management of patients with suspected CC who received the survey may inform broader adoption and utilization of survey imaging protocols in emergency settings to increase throughput, improve resource allocation, and provide efficient care for patients with suspected CC.
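Two of the figures above can be checked with simple arithmetic; the sketch below does so, with the caveat that the routine total spine scan time is inferred from the stated 79% speed-up rather than reported in the article.

suspected_cc = 845        # patients scanned for suspected cord compression
radiographic_cc = 120     # patients with radiographic cord compression
positivity_rate = radiographic_cc / suspected_cc
print(f"Positivity rate: {positivity_rate:.1%}")  # ~14.2%

survey_minutes = 5 + 50 / 60   # 5 minutes 50 seconds per survey examination
speedup = 0.79                 # "79% faster than routine MR imaging"
implied_routine_minutes = survey_minutes / (1 - speedup)
print(f"Implied routine total spine time: {implied_routine_minutes:.0f} minutes")  # ~28 minutes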

4.
medRxiv; 2023 Feb 07.
Article in English | MEDLINE | ID: mdl-36798292

ABSTRACT

BACKGROUND: ChatGPT, a popular new large language model (LLM) built by OpenAI, has shown impressive performance in a number of specialized applications. Despite the rising popularity and performance of AI, studies evaluating the use of LLMs for clinical decision support are lacking. PURPOSE: To evaluate ChatGPT's capacity for clinical decision support in radiology via the identification of appropriate imaging services for two important clinical presentations: breast cancer screening and breast pain. MATERIALS AND METHODS: We compared ChatGPT's responses to the American College of Radiology (ACR) Appropriateness Criteria for breast pain and breast cancer screening. Our prompt formats included an open-ended (OE) format, where ChatGPT was asked to provide the single most appropriate imaging procedure, and a select all that apply (SATA) format, where ChatGPT was given a list of imaging modalities to assess. Scoring criteria evaluated whether proposed imaging modalities were in accordance with ACR guidelines. RESULTS: ChatGPT achieved an average OE score of 1.83 (out of 2) and a SATA average percentage correct of 88.9% for breast cancer screening prompts, and an average OE score of 1.125 (out of 2) and a SATA average percentage correct of 58.3% for breast pain prompts. CONCLUSION: Our results demonstrate the feasibility of using ChatGPT for radiologic decision making, with the potential to improve clinical workflow and responsible use of radiology services.
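The abstract describes two prompt formats and their scoring; a hypothetical sketch of both follows. The 2/1/0 open-ended scale, the guideline sets, and the example modalities are assumptions for illustration, not the ACR Appropriateness Criteria or the authors' rubric.

from typing import Set

def score_open_ended(proposed: str, usually_appropriate: Set[str],
                     may_be_appropriate: Set[str]) -> float:
    """Open-ended (OE) prompt: assumed 2 points for a usually appropriate
    modality, 1 for may be appropriate, 0 otherwise."""
    if proposed in usually_appropriate:
        return 2.0
    if proposed in may_be_appropriate:
        return 1.0
    return 0.0

def score_sata(selected: Set[str], all_modalities: Set[str],
               appropriate: Set[str]) -> float:
    """Select all that apply (SATA) prompt: fraction of listed modalities
    classified correctly (selected if and only if appropriate)."""
    correct = sum((m in selected) == (m in appropriate) for m in all_modalities)
    return correct / len(all_modalities)

# Hypothetical usage for an average-risk screening scenario.
modalities = {"digital mammography", "breast MRI", "breast ultrasound"}
print(score_open_ended("digital mammography", {"digital mammography"}, set()))   # 2.0
print(score_sata({"digital mammography"}, modalities, {"digital mammography"}))  # 1.0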

5.
J Am Coll Radiol; 20(10): 990-997, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37356806

ABSTRACT

OBJECTIVE: Despite rising popularity and performance, studies evaluating the use of large language models for clinical decision support are lacking. Here, we evaluate the capacity of ChatGPT (Generative Pre-trained Transformer)-3.5 and GPT-4 (OpenAI, San Francisco, California) for clinical decision support in radiology via the identification of appropriate imaging services for two important clinical presentations: breast cancer screening and breast pain. METHODS: We compared ChatGPT's responses to the ACR Appropriateness Criteria for breast pain and breast cancer screening. Our prompt formats included an open-ended (OE) and a select all that apply (SATA) format. Scoring criteria evaluated whether proposed imaging modalities were in accordance with ACR guidelines. Three replicate entries were conducted for each prompt, and the average of these was used to determine final scores. RESULTS: Both ChatGPT-3.5 and ChatGPT-4 achieved an average OE score of 1.830 (out of 2) for breast cancer screening prompts. ChatGPT-3.5 achieved a SATA average percentage correct of 88.9%, compared with ChatGPT-4's average percentage correct of 98.4% for breast cancer screening prompts. For breast pain, ChatGPT-3.5 achieved an average OE score of 1.125 (out of 2) and a SATA average percentage correct of 58.3%, as compared with ChatGPT-4's average OE score of 1.666 (out of 2) and SATA average percentage correct of 77.7%. DISCUSSION: Our results demonstrate the eventual feasibility of using large language models like ChatGPT for radiologic decision making, with the potential to improve clinical workflow and responsible use of radiology services. More use cases and greater accuracy are necessary to evaluate and implement such tools.
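A small sketch of the replicate averaging described in the methods is shown below; the replicate values are invented solely so that their means approximately reproduce the breast-pain OE averages reported above, and the labels are illustrative.

from statistics import mean

# Invented replicate OE scores (out of 2) for the breast pain scenario.
replicate_scores = {
    ("ChatGPT-3.5", "breast pain, OE"): [1.0, 1.0, 1.375],
    ("ChatGPT-4",   "breast pain, OE"): [2.0, 1.5, 1.5],
}

final_scores = {key: mean(values) for key, values in replicate_scores.items()}
for (model, prompt), score in sorted(final_scores.items()):
    print(f"{model:12s} {prompt:16s} mean of 3 replicates = {score:.3f}")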


Subjects
Breast Neoplasms, Mastodynia, Radiology, Humans, Female, Breast Neoplasms/diagnostic imaging, Decision Making
6.
medRxiv; 2023 Feb 26.
Article in English | MEDLINE | ID: mdl-36865204

ABSTRACT

IMPORTANCE: Large language model (LLM) artificial intelligence (AI) chatbots direct the power of large training datasets towards successive, related tasks, as opposed to single-ask tasks, for which AI already achieves impressive performance. The capacity of LLMs to assist in the full scope of iterative clinical reasoning via successive prompting, in effect acting as virtual physicians, has not yet been evaluated. OBJECTIVE: To evaluate ChatGPT's capacity for ongoing clinical decision support via its performance on standardized clinical vignettes. DESIGN: We inputted all 36 published clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual into ChatGPT and compared its accuracy on differential diagnoses, diagnostic testing, final diagnosis, and management based on patient age, gender, and case acuity. SETTING: ChatGPT, a publicly available LLM. PARTICIPANTS: Clinical vignettes featured hypothetical patients with a variety of age and gender identities and a range of Emergency Severity Indices (ESIs) based on initial clinical presentation. EXPOSURES: MSD Clinical Manual vignettes. MAIN OUTCOMES AND MEASURES: We measured the proportion of correct responses to the questions posed within the clinical vignettes tested. RESULTS: ChatGPT achieved 71.7% (95% CI, 69.3% to 74.1%) accuracy overall across all 36 clinical vignettes. The LLM demonstrated the highest performance in making a final diagnosis, with an accuracy of 76.9% (95% CI, 67.8% to 86.1%), and the lowest performance in generating an initial differential diagnosis, with an accuracy of 60.3% (95% CI, 54.2% to 66.6%). Compared to answering questions about general medical knowledge, ChatGPT demonstrated inferior performance on differential diagnosis (β=-15.8%, p<0.001) and clinical management (β=-7.4%, p=0.02) type questions. CONCLUSIONS AND RELEVANCE: ChatGPT achieves impressive accuracy in clinical decision making, with particular strengths emerging as it has more clinical information at its disposal.
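As a worked illustration of the reported confidence intervals (a sketch, not the authors' analysis), the snippet below computes a normal-approximation 95% CI for a proportion of correct responses; the question counts are hypothetical, chosen only so the arithmetic lands near the overall figures reported above.

from math import sqrt

correct, total = 974, 1358   # hypothetical counts yielding ~71.7% overall accuracy
p = correct / total
se = sqrt(p * (1 - p) / total)            # normal-approximation standard error
lower, upper = p - 1.96 * se, p + 1.96 * se
print(f"Accuracy {p:.1%} (95% CI, {lower:.1%} to {upper:.1%})")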
