1.
BMJ Open Qual; 13(2), 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38830730

ABSTRACT

BACKGROUND: Manual chart review using validated assessment tools is a standardised methodology for detecting diagnostic errors. However, it requires considerable human resources and time. ChatGPT, a recently developed artificial intelligence chatbot based on a large language model, can effectively classify text given suitable prompts, and could therefore assist manual chart review in detecting diagnostic errors. OBJECTIVE: This study aimed to clarify whether ChatGPT could correctly detect diagnostic errors, and possible factors contributing to them, based on case presentations. METHODS: We analysed 545 published case reports that included diagnostic errors. We entered the texts of the case presentations and the final diagnoses, together with original prompts, into ChatGPT (GPT-4) to generate responses, including a judgement of whether a diagnostic error had occurred and the factors contributing to it. Contributing factors were coded according to three taxonomies: Diagnosis Error Evaluation and Research (DEER), Reliable Diagnosis Challenges (RDC) and Generic Diagnostic Pitfalls (GDP). ChatGPT's responses on contributing factors were compared with those of physicians. RESULTS: ChatGPT correctly detected diagnostic errors in 519/545 cases (95%) and coded significantly more contributing factors per case than physicians: DEER (median 5 vs 1, p<0.001), RDC (median 4 vs 2, p<0.001) and GDP (median 4 vs 1, p<0.001). The contributing factors most frequently coded by ChatGPT were 'failure/delay in considering the diagnosis' (315, 57.8%) in DEER, 'atypical presentation' (365, 67.0%) in RDC, and 'atypical presentation' (264, 48.4%) in GDP. CONCLUSION: ChatGPT accurately detects diagnostic errors from case presentations and may be more sensitive than manual review in detecting contributing factors, especially 'atypical presentation'.
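The abstract does not reproduce the authors' prompts or pipeline. As a minimal sketch of how one such classification step could be implemented, assuming the OpenAI chat-completions API (the model name, prompt wording, and judge_case helper are illustrative, not the study's own code):

```python
# Hypothetical sketch: asking an LLM to judge a case report for diagnostic
# error and to code contributing factors against the DEER taxonomy.
# Prompt text and helper names are invented for illustration.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT_TEMPLATE = """You are reviewing a published case report.
Case presentation:
{case_text}

Final diagnosis: {final_diagnosis}

1. Was there a diagnostic error? Answer YES or NO.
2. If YES, list the DEER taxonomy codes for factors that contributed."""

def judge_case(case_text: str, final_diagnosis: str) -> str:
    """Send one case to the model and return its raw judgement text."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": PROMPT_TEMPLATE.format(
                       case_text=case_text,
                       final_diagnosis=final_diagnosis)}],
        temperature=0,  # reduce run-to-run variation in the judgement
    )
    return response.choices[0].message.content

# Example call (toy input):
# print(judge_case("A 54-year-old with chest pain ...", "Aortic dissection"))
```

Setting temperature to 0 reduces, but does not eliminate, run-to-run variation in the model's output.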


Subject(s)
Diagnostic Errors , Humans , Diagnostic Errors/statistics & numerical data , Artificial Intelligence/standards
3.
J Med Internet Res; 26: e54705, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38776538

ABSTRACT

BACKGROUND: In recent years, there has been a surge of artificial intelligence (AI) studies in the health care literature, accompanied by a growing number of proposed standards for evaluating the quality of health care AI studies. OBJECTIVE: This rapid umbrella review examines the use of AI quality standards in a sample of health care AI systematic review articles published over a 36-month period. METHODS: We used a modified version of the Joanna Briggs Institute umbrella review method, with a rapid approach informed by Tricco and colleagues' practical guide to conducting rapid reviews. The search focused on the MEDLINE database, supplemented with Google Scholar. The inclusion criteria were English-language systematic reviews, regardless of review type, that mentioned AI and health in the abstract and were published during the 36-month period. For the synthesis, we summarized the AI quality standards used and the issues noted in these reviews, drawing on a set of published health care AI standards; harmonized the terms used; and offered guidance to improve the quality of future health care AI studies. RESULTS: We selected 33 review articles published between 2020 and 2022. The reviews covered a wide range of objectives, topics, settings, designs, and results. Over 60 AI approaches across different domains were identified, with varying levels of detail spanning different AI life cycle stages, making comparisons difficult. Health care AI quality standards were applied in only 39% (13/33) of the reviews and in 14% (25/178) of the original studies they examined, mostly to appraise methodological or reporting quality. Only a handful mentioned transparency, explainability, trustworthiness, ethics, or privacy. A total of 23 issues related to AI quality standards were identified in the reviews. There was a recognized need to standardize the planning, conduct, and reporting of health care AI studies and to address their broader societal, ethical, and regulatory implications. CONCLUSIONS: Despite the growing number of standards for assessing the quality of health care AI studies, they are seldom applied in practice. With the increasing desire to adopt AI across health topics, domains, and settings, practitioners and researchers must stay abreast of the evolving landscape of health care AI quality standards and apply them to improve the quality of their studies.


Subject(s)
Artificial Intelligence , Artificial Intelligence/standards , Humans , Delivery of Health Care/standards , Quality of Health Care/standards
5.
JMIR Mhealth Uhealth; 12: e57978, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38688841

ABSTRACT

The increasing interest in potential applications of generative artificial intelligence (AI) models such as ChatGPT in health care has prompted numerous studies to explore its performance in various medical contexts. However, evaluating ChatGPT poses unique challenges due to the inherent randomness in its responses. Unlike traditional AI models, ChatGPT can generate different responses to the same input, making it imperative to assess its stability through repetition. This commentary highlights the importance of including repetition in the evaluation of ChatGPT to ensure the reliability of conclusions drawn from its performance. Just as biological experiments often require multiple repetitions to be valid, we argue that assessing generative AI models such as ChatGPT demands the same approach. Failure to account for the impact of repetition can lead to biased conclusions and undermine the credibility of research findings. We urge researchers to incorporate appropriate repetition into their studies from the outset and to report their methods transparently, enhancing the robustness and reproducibility of findings in this rapidly evolving field.
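As a rough sketch of the kind of repetition protocol the commentary advocates (the ask_model stub, the agreement metric, and the default of 10 repeats are assumptions for illustration, not a prescribed method):

```python
# Illustrative repetition protocol: pose the same question k times and
# report how stable the answers are, rather than trusting a single sample.
import random
from collections import Counter

def ask_model(question: str) -> str:
    """Placeholder for a real ChatGPT call; simulated here with random
    draws to stand in for the model's run-to-run variability."""
    return random.choice(["Yes", "Yes", "No"])  # toy stochastic answers

def stability(question: str, k: int = 10) -> tuple[str, float]:
    """Ask the same question k times; return the modal answer and the
    fraction of runs that agreed with it."""
    answers = [ask_model(question) for _ in range(k)]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / k

answer, agreement = stability("Is drug X contraindicated in pregnancy?")
print(f"modal answer: {answer} (agreement {agreement:.0%} over 10 runs)")
```

Reporting the agreement rate alongside the answer makes run-to-run variability visible instead of hiding it behind a single sampled response.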


Subject(s)
Artificial Intelligence , Humans , Artificial Intelligence/trends , Artificial Intelligence/standards , Reproducibility of Results
6.
Curr Pharm Teach Learn; 16(6): 404-410, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38641483

ABSTRACT

OBJECTIVES: ChatGPT is an innovative artificial intelligence tool designed to enhance human activities and serve as a potent instrument for information retrieval. This study aimed to evaluate the performance and limitations of ChatGPT on a fourth-year pharmacy student examination. METHODS: This cross-sectional study was conducted in February 2023 at the Faculty of Pharmacy, Chiang Mai University, Thailand. The exam contained 16 multiple-choice questions and 2 short-answer questions, focusing on the classification and medical management of shock and electrolyte disorders. RESULTS: ChatGPT answered 44% (8 of 18) of the questions correctly, whereas the students achieved a higher accuracy of 66% (12 of 18). These findings underscore that while AI exhibits proficiency, it encounters limitations when confronted with queries derived from practical scenarios, in contrast to pharmacy students, who are free to explore and collaborate in ways that mirror real-world practice. CONCLUSIONS: Users must exercise caution regarding ChatGPT's reliability, and AI-generated answers should be interpreted judiciously given potential limitations in multi-step analysis and reliance on outdated data. Future AI models, with refinements and tailored enhancements, offer the potential for improved performance.
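As a hypothetical sketch of the scoring step such a comparison implies (the accuracy helper is invented for illustration; the study's grading procedure is not described in the abstract):

```python
# Illustrative scoring: compare graded answers against an answer key and
# compute the fraction correct, as was done for both ChatGPT and students.
def accuracy(responses: list[str], answer_key: list[str]) -> float:
    """Fraction of questions answered correctly."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer key differ in length")
    return sum(r == k for r, k in zip(responses, answer_key)) / len(answer_key)

# The study's headline figures follow directly from the reported counts:
chatgpt_accuracy = 8 / 18    # ≈ 0.44
student_accuracy = 12 / 18   # ≈ 0.67 (reported as 66% in the abstract)
```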


Subject(s)
Educational Measurement , Students, Pharmacy , Humans , Thailand , Students, Pharmacy/statistics & numerical data , Students, Pharmacy/psychology , Cross-Sectional Studies , Educational Measurement/methods , Educational Measurement/statistics & numerical data , Education, Pharmacy/methods , Education, Pharmacy/standards , Education, Pharmacy/statistics & numerical data , Artificial Intelligence/standards , Artificial Intelligence/trends , Artificial Intelligence/statistics & numerical data , Male , Female , Reproducibility of Results , Adult
13.
JAMA; 331(1): 65-69, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38032660

ABSTRACT

Importance: Since the introduction of ChatGPT in late 2022, generative artificial intelligence (genAI) has elicited enormous enthusiasm and serious concerns. Observations: History has shown that general purpose technologies often fail to deliver their promised benefits for many years ("the productivity paradox of information technology"). Health care has several attributes that make the successful deployment of new technologies even more difficult than in other industries; these have challenged prior efforts to implement AI and electronic health records. However, genAI has unique properties that may shorten the usual lag between implementation and productivity and/or quality gains in health care. Moreover, the health care ecosystem has evolved to make it more receptive to genAI, and many health care organizations are poised to implement the complementary innovations in culture, leadership, workforce, and workflow often needed for digital innovations to flourish. Conclusions and Relevance: The ability of genAI to rapidly improve and the capacity of organizations to implement complementary innovations that allow IT tools to reach their potential are more advanced than in the past; thus, genAI is capable of delivering meaningful improvements in health care more rapidly than was the case with previous technologies.


Subject(s)
Artificial Intelligence , Delivery of Health Care , Artificial Intelligence/standards , Artificial Intelligence/trends , Delivery of Health Care/methods , Delivery of Health Care/trends , Diffusion of Innovation
14.
JAMA; 331(3): 245-249, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38117493

ABSTRACT

Importance: Given the rigorous development and evaluation standards needed for artificial intelligence (AI) models used in health care, nationally accepted procedures to provide assurance that the use of AI is fair, appropriate, valid, effective, and safe are urgently needed. Observations: While there are several efforts to develop standards and best practices for evaluating AI, there is a gap between having such guidance and applying it to both existing AI models and new ones under development. As of now, there is no publicly available, nationwide mechanism that enables objective evaluation and ongoing assessment of the consequences of using health AI models in clinical care settings. Conclusion and Relevance: The need to create a public-private partnership to support a nationwide network of health AI assurance labs is outlined here. In this network, community best practices could be applied to test health AI models and produce reports on their performance that can be widely shared, supporting management of the life cycle of AI models over time and across the populations and sites where they are deployed.


Subject(s)
Artificial Intelligence , Delivery of Health Care , Laboratories , Quality Assurance, Health Care , Quality of Health Care , Artificial Intelligence/standards , Health Facilities/standards , Laboratories/standards , Public-Private Sector Partnerships , Quality Assurance, Health Care/standards , Delivery of Health Care/standards , Quality of Health Care/standards , United States
20.
Nature; 620(7972): 47-60, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37532811

ABSTRACT

Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade, including self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches require improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and call for foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.
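As a toy illustration, not drawn from the paper, of the self-supervised idea the review highlights: the training signal is manufactured from unlabelled data itself, here by masking one element of a sequence so a model could learn to reconstruct it (the masked_pairs helper and the protein fragment are invented for this example):

```python
# Toy self-supervised pretext task: masking positions in unlabelled
# sequences yields (input, target) pairs with no human annotation.
def masked_pairs(sequence: list[str], mask_token: str = "[MASK]"):
    """Yield (masked_sequence, hidden_value, position) training examples."""
    for i in range(len(sequence)):
        masked = list(sequence)
        hidden = masked[i]
        masked[i] = mask_token
        yield masked, hidden, i

protein_fragment = list("MKTAYIAK")  # raw, unlabelled data
for masked, hidden, pos in masked_pairs(protein_fragment):
    print("".join(masked), "-> predict", hidden, "at position", pos)
```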


Subject(s)
Artificial Intelligence , Research Design , Artificial Intelligence/standards , Artificial Intelligence/trends , Datasets as Topic , Deep Learning , Research Design/standards , Research Design/trends , Unsupervised Machine Learning