Results 1-20 of 257
1.
BMJ Open Qual; 13(2), 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38830730

ABSTRACT

BACKGROUND: Manual chart review using validated assessment tools is a standardised methodology for detecting diagnostic errors. However, it requires considerable human resources and time. ChatGPT, a recently developed artificial intelligence chatbot based on a large language model, can effectively classify text given suitable prompts, and could therefore assist manual chart review in detecting diagnostic errors. OBJECTIVE: This study aimed to clarify whether ChatGPT could correctly detect diagnostic errors, and possible factors contributing to them, from case presentations. METHODS: We analysed 545 published case reports that included diagnostic errors. We entered the texts of the case presentations and the final diagnoses, together with original prompts, into ChatGPT (GPT-4) to generate responses comprising a judgement on whether a diagnostic error occurred and its contributing factors. Contributing factors were coded according to three taxonomies: Diagnosis Error Evaluation and Research (DEER), Reliable Diagnosis Challenges (RDC) and Generic Diagnostic Pitfalls (GDP). ChatGPT's responses on contributing factors were compared with those of physicians. RESULTS: ChatGPT correctly detected diagnostic errors in 519/545 cases (95%) and coded significantly more contributing factors per case than physicians: DEER (median 5 vs 1, p<0.001), RDC (median 4 vs 2, p<0.001) and GDP (median 4 vs 1, p<0.001). The contributing factors most frequently coded by ChatGPT were 'failure/delay in considering the diagnosis' (315, 57.8%) in DEER and 'atypical presentation' in both RDC (365, 67.0%) and GDP (264, 48.4%). CONCLUSION: ChatGPT accurately detects diagnostic errors from case presentations and may be more sensitive than manual review in detecting contributing factors, especially 'atypical presentation'.
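To make the method concrete, here is a minimal sketch of this kind of LLM-assisted error detection, assuming the OpenAI Python client; the prompt wording, model identifier, and example case are illustrative placeholders, not the study's actual prompts:

```python
# Sketch: ask GPT-4 whether a case presentation involves a diagnostic error
# and which DEER factors contributed. All prompt text is illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge_case(case_presentation: str, final_diagnosis: str) -> str:
    prompt = (
        "You are reviewing a published case report.\n\n"
        f"Case presentation:\n{case_presentation}\n\n"
        f"Final diagnosis: {final_diagnosis}\n\n"
        "1. Did a diagnostic error occur? Answer yes or no.\n"
        "2. If yes, list contributing factors as DEER taxonomy labels,\n"
        "   e.g. 'failure/delay in considering the diagnosis'."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the judgement as deterministic as possible
    )
    return response.choices[0].message.content

print(judge_case("A 54-year-old man presented with tearing chest pain...",
                 "Aortic dissection"))
```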


Subject(s)
Diagnostic Errors; Humans; Diagnostic Errors/statistics & numerical data; Artificial Intelligence/standards
4.
Curr Pharm Teach Learn; 16(7): 102101, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38702261

ABSTRACT

INTRODUCTION: Artificial intelligence (AI), and ChatGPT in particular, is becoming increasingly prevalent in health care for tasks such as disease diagnosis and medical record analysis. The objective of this study was to evaluate the proficiency and accuracy of ChatGPT across different domains of clinical pharmacy cases and queries. METHODS: The study compared ChatGPT's responses to pharmacotherapy cases and questions taken from McGraw Hill's NAPLEX® Review Questions, 4th edition, covering 10 different chronic conditions, against the answers provided by the book's authors. The proportion of correct responses was collected and analysed using the Statistical Package for the Social Sciences (SPSS), version 29. RESULTS: ChatGPT's mean score was higher when tested in English than in Turkish (0.41 ± 0.49 vs 0.32 ± 0.46), although the difference was not statistically significant (p = 0.18). Responses to queries beginning with 'Which of the following is correct?' were considerably more accurate than those beginning with 'Mark all the incorrect answers': 0.66 ± 0.47 versus 0.16 ± 0.36 (p = 0.01) in English, and 0.50 ± 0.50 versus 0.14 ± 0.34 (p < 0.05) in Turkish. CONCLUSION: ChatGPT displayed moderate accuracy when responding to English queries but only slight accuracy when responding to Turkish queries, contingent on the question format. Improving ChatGPT's accuracy in languages other than English will require incorporating several additional components. Integrating the English version of ChatGPT into clinical practice has the potential to improve the effectiveness, precision, and standard of patient care by supplementing personal expertise and professional judgement. However, it is crucial to use the technology as an adjunct to, not a replacement for, human decision-making and critical thinking.
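As an illustration of the kind of accuracy comparison reported above, the following sketch contrasts the two question formats with Fisher's exact test; the counts are hypothetical placeholders, not the study's data (the paper itself used SPSS):

```python
# Sketch: comparing accuracy across the two question formats with
# Fisher's exact test. The counts are hypothetical, not study data.
from scipy.stats import fisher_exact

correct_single, total_single = 33, 50  # "Which of the following is correct?"
correct_multi, total_multi = 8, 50     # "Mark all the incorrect answers"

table = [
    [correct_single, total_single - correct_single],
    [correct_multi, total_multi - correct_multi],
]
odds_ratio, p_value = fisher_exact(table)
print(f"accuracy {correct_single / total_single:.2f} vs "
      f"{correct_multi / total_multi:.2f}; Fisher's exact p = {p_value:.4f}")
```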


Subject(s)
Artificial Intelligence; Humans; Turkey; Reproducibility of Results; Artificial Intelligence/standards; Surveys and Questionnaires; Language
7.
J Med Internet Res; 26: e54705, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38776538

ABSTRACT

BACKGROUND: In recent years, there has been a surge of artificial intelligence (AI) studies in the health care literature, accompanied by an increasing number of proposed standards for evaluating the quality of health care AI studies. OBJECTIVE: This rapid umbrella review examines the use of AI quality standards in a sample of health care AI systematic review articles published over a 36-month period. METHODS: We used a modified version of the Joanna Briggs Institute umbrella review method. Our rapid approach was informed by the practical guide by Tricco and colleagues for conducting rapid reviews. Our search focused on the MEDLINE database, supplemented with Google Scholar. The inclusion criteria were English-language systematic reviews of any review type, with mention of AI and health in the abstract, published during a 36-month period. For the synthesis, we summarized the AI quality standards used and the issues noted in these reviews, drawing on a set of published health care AI standards, harmonized the terms used, and offered guidance to improve the quality of future health care AI studies. RESULTS: We selected 33 review articles published between 2020 and 2022 for our synthesis. The reviews covered a wide range of objectives, topics, settings, designs, and results. Over 60 AI approaches across different domains were identified, described with varying levels of detail and spanning different AI life cycle stages, which made comparisons difficult. Health care AI quality standards were applied in only 39% (13/33) of the reviews and in 14% (25/178) of the original studies from the reviews examined, mostly to appraise methodological or reporting quality. Only a handful mentioned transparency, explainability, trustworthiness, ethics, or privacy. A total of 23 AI quality standard-related issues were identified in the reviews. There was a recognized need to standardize the planning, conduct, and reporting of health care AI studies and to address their broader societal, ethical, and regulatory implications. CONCLUSIONS: Despite the growing number of AI standards for assessing the quality of health care AI studies, they are seldom applied in practice. With the increasing desire to adopt AI across different health topics, domains, and settings, practitioners and researchers must stay abreast of and adapt to the evolving landscape of health care AI quality standards and apply these standards to improve the quality of their AI studies.


Subject(s)
Artificial Intelligence; Artificial Intelligence/standards; Humans; Delivery of Health Care/standards; Quality of Health Care/standards
11.
JMIR Mhealth Uhealth; 12: e57978, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38688841

ABSTRACT

The increasing interest in the potential applications of generative artificial intelligence (AI) models like ChatGPT in health care has prompted numerous studies to explore their performance in various medical contexts. However, evaluating ChatGPT poses unique challenges due to the inherent randomness in its responses. Unlike traditional AI models, ChatGPT generates different responses to the same input, making it imperative to assess its stability through repetition. This commentary highlights the importance of including repetition in the evaluation of ChatGPT to ensure the reliability of conclusions drawn from its performance. Just as biological experiments often require multiple replicates for validity, we argue that assessing generative AI models like ChatGPT demands the same approach. Failure to acknowledge the impact of repetition can lead to biased conclusions and undermine the credibility of research findings. We urge researchers to incorporate appropriate repetition in their studies from the outset and to report their methods transparently, to enhance the robustness and reproducibility of findings in this rapidly evolving field.
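As a sketch of what such repetition might look like in practice, the following snippet sends the same prompt several times and reports how often the modal answer recurs; it assumes the OpenAI Python client, and the modal-answer share is one simple agreement metric among many:

```python
# Sketch: repeating one prompt to gauge response stability. Assumes the
# OpenAI Python client; modal-answer share is one simple agreement metric.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def modal_agreement(prompt: str, n_repeats: int = 10) -> float:
    answers = []
    for _ in range(n_repeats):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        answers.append(response.choices[0].message.content.strip())
    # Fraction of runs producing the most common answer; 1.0 = fully stable.
    return Counter(answers).most_common(1)[0][1] / n_repeats

print(modal_agreement("Name the electrolyte disorder that causes U waves "
                      "on ECG. Answer in one word."))
```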


Subject(s)
Artificial Intelligence; Humans; Artificial Intelligence/trends; Artificial Intelligence/standards; Reproducibility of Results
12.
Curr Pharm Teach Learn; 16(6): 404-410, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38641483

ABSTRACT

OBJECTIVES: ChatGPT is an innovative artificial intelligence designed to enhance human activities and serve as a potent tool for information retrieval. This study aimed to evaluate the performance and limitations of ChatGPT on a fourth-year pharmacy student examination. METHODS: This cross-sectional study was conducted in February 2023 at the Faculty of Pharmacy, Chiang Mai University, Thailand. The exam contained 16 multiple-choice questions and 2 short-answer questions, focusing on the classification and medical management of shock and electrolyte disorders. RESULTS: ChatGPT answered 44% (8 of 18) of the questions correctly, whereas the students achieved a higher accuracy rate of 66% (12 of 18). These findings underscore that, while AI exhibits proficiency, it encounters limitations when confronted with specific queries derived from practical scenarios, unlike pharmacy students, who are free to explore and collaborate in ways that mirror real-world practice. CONCLUSIONS: Users must exercise caution regarding ChatGPT's reliability, and AI-generated answers should be interpreted judiciously given its potential limitations in multi-step analysis and its reliance on outdated data. Future AI models, with refinements and tailored enhancements, offer the potential for improved performance.
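To illustrate how such an exam comparison can be scored, here is a small sketch that grades responses against an answer key and attaches an exact binomial confidence interval to the accuracy; the answer lists are hypothetical placeholders:

```python
# Sketch: grading responses against an answer key with an exact binomial
# confidence interval on the accuracy. The answer lists are hypothetical.
from scipy.stats import binomtest

answer_key = ["B", "C", "A", "D", "B", "A", "C", "D", "A", "B"]
responses  = ["B", "A", "A", "D", "C", "A", "C", "B", "A", "B"]

n_correct = sum(r == k for r, k in zip(responses, answer_key))
ci = binomtest(n_correct, len(answer_key)).proportion_ci(confidence_level=0.95)
print(f"accuracy {n_correct}/{len(answer_key)} = {n_correct / len(answer_key):.2f}, "
      f"95% CI [{ci.low:.2f}, {ci.high:.2f}]")
```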


Subject(s)
Educational Measurement; Students, Pharmacy; Humans; Thailand; Students, Pharmacy/statistics & numerical data; Students, Pharmacy/psychology; Cross-Sectional Studies; Educational Measurement/methods; Educational Measurement/statistics & numerical data; Education, Pharmacy/methods; Education, Pharmacy/standards; Education, Pharmacy/statistics & numerical data; Artificial Intelligence/standards; Artificial Intelligence/trends; Artificial Intelligence/statistics & numerical data; Male; Female; Reproducibility of Results; Adult
18.
JAMA; 331(1): 65-69, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38032660

ABSTRACT

Importance: Since the introduction of ChatGPT in late 2022, generative artificial intelligence (genAI) has elicited enormous enthusiasm and serious concerns. Observations: History has shown that general purpose technologies often fail to deliver their promised benefits for many years ("the productivity paradox of information technology"). Health care has several attributes that make the successful deployment of new technologies even more difficult than in other industries; these have challenged prior efforts to implement AI and electronic health records. However, genAI has unique properties that may shorten the usual lag between implementation and productivity and/or quality gains in health care. Moreover, the health care ecosystem has evolved to make it more receptive to genAI, and many health care organizations are poised to implement the complementary innovations in culture, leadership, workforce, and workflow often needed for digital innovations to flourish. Conclusions and Relevance: The ability of genAI to rapidly improve and the capacity of organizations to implement complementary innovations that allow IT tools to reach their potential are more advanced than in the past; thus, genAI is capable of delivering meaningful improvements in health care more rapidly than was the case with previous technologies.


Subject(s)
Artificial Intelligence; Delivery of Health Care; Artificial Intelligence/standards; Artificial Intelligence/trends; Delivery of Health Care/methods; Delivery of Health Care/trends; Diffusion of Innovation
19.
JAMA; 331(3): 245-249, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38117493

ABSTRACT

Importance: Given the rigorous development and evaluation standards needed for artificial intelligence (AI) models used in health care, nationally accepted procedures to provide assurance that the use of AI is fair, appropriate, valid, effective, and safe are urgently needed. Observations: While there are several efforts to develop standards and best practices for evaluating AI, there is a gap between having such guidance and applying it to both existing and new AI models. At present, there is no publicly available, nationwide mechanism that enables objective evaluation and ongoing assessment of the consequences of using health AI models in clinical care settings. Conclusion and Relevance: The need to create a public-private partnership to support a nationwide network of health AI assurance labs is outlined here. In this network, community best practices could be applied to test health AI models and produce reports on their performance that can be widely shared, supporting lifecycle management of AI models over time and across the populations and sites where they are deployed.


Subject(s)
Artificial Intelligence; Delivery of Health Care; Laboratories; Quality Assurance, Health Care; Quality of Health Care; Artificial Intelligence/standards; Health Facilities/standards; Laboratories/standards; Public-Private Sector Partnerships; Quality Assurance, Health Care/standards; Delivery of Health Care/standards; Quality of Health Care/standards; United States