Pesquisa | Biblioteca Virtual em Saúde

1.

How should artificial intelligence be used in Australian health care? Recommendations from a citizens' jury.

Carter, Stacy M; Aquino, Yves Saint James; Carolan, Lucy; Frost, Emma; Degeling, Chris; Rogers, Wendy A; Scott, Ian A; Bell, Katy Jl; Fabrianesi, Belinda; Magrabi, Farah.

Med J Aust ; 220(8): 409-416, 2024 May 06.

Artigo em Inglês | MEDLINE | ID: mdl-38629188

RESUMO

OBJECTIVE: To support a diverse sample of Australians to make recommendations about the use of artificial intelligence (AI) technology in health care. STUDY DESIGN: Citizens' jury, deliberating the question: "Under which circumstances, if any, should artificial intelligence be used in Australian health systems to detect or diagnose disease?" SETTING, PARTICIPANTS: Thirty Australian adults recruited by Sortition Foundation using random invitation and stratified selection to reflect population proportions by gender, age, ancestry, highest level of education, and residential location (state/territory; urban, regional, rural). The jury process took 18 days (16 March - 2 April 2023): fifteen days online and three days face-to-face in Sydney, where the jurors, both in small groups and together, were informed about and discussed the question, and developed recommendations with reasons. Jurors received extensive information: a printed handbook, online documents, and recorded presentations by four expert speakers. Jurors asked questions and received answers from the experts during the online period of the process, and during the first day of the face-to-face meeting. MAIN OUTCOME MEASURES: Jury recommendations, with reasons. RESULTS: The jurors recommended an overarching, independently governed charter and framework for health care AI. The other nine recommendation categories concerned balancing benefits and harms; fairness and bias; patients' rights and choices; clinical governance and training; technical governance and standards; data governance and use; open source software; AI evaluation and assessment; and education and communication. CONCLUSIONS: The deliberative process supported a nationally representative sample of citizens to construct recommendations about how AI in health care should be developed, used, and governed. Recommendations derived using such methods could guide clinicians, policy makers, AI researchers and developers, and health service users to develop approaches that ensure trustworthy and responsible use of this technology.

Assuntos

Inteligência Artificial , Humanos , Austrália , Feminino , Masculino , Adulto , Atenção à Saúde , Pessoa de Meia-Idade , Idoso

2.

A qualitative analysis of health service problems and the strategies used to manage them in the COVID-19 pandemic: exploiting generic and context-specific approaches.

Rahimi-Ardabili, Hania; Magrabi, Farah; Sanderson, Brenton; Schuler, Thilo; Coiera, Enrico.

BMC Health Serv Res ; 24(1): 1067, 2024 Sep 13.

Artigo em Inglês | MEDLINE | ID: mdl-39272078

RESUMO

BACKGROUND: The COVID-19 pandemic disrupted health systems around the globe. Lessons from health systems responses to these challenges may help design effective and sustainable health system responses for future challenges. This study aimed to 1/ identify the broad types of health system challenges faced during the pandemic and 2/ develop a typology of health system response to these challenges. METHODS: Semi-structured one-on-one online interviews explored the experience of 19 health professionals during COVID-19 in a large state health system in Australia. Data were analysed using constant comparative analysis utilising a sociotechnical system lens. RESULTS: Participants described four overarching challenges: 1/ System overload, 2/ Barriers to decision-making, 3/ Education or training gaps, and 4/ Limitations of existing services. The limited time often available to respond meant that specific and well-designed strategies were often not possible, and more generic strategies that relied on the workforce to modify solutions and repair unexpected gaps were common. For example, generic responses to system overload included working longer hours, whilst specific strategies utilised pre-existing technical resources (e.g. converting non-emergency wards into COVID-19 wards). CONCLUSION: During the pandemic, it was often not possible to rely on mature strategies to frame responses, and more generic, emergent approaches were commonly required when urgent responses were needed. The degree to which specific strategies were ready-to-hand appeared to dictate how much a strategy relied on such generic approaches. The workforce played a pivotal role in enabling emergent responses that required dealing with uncertainties.

Assuntos

COVID-19 , Pandemias , Pesquisa Qualitativa , COVID-19/epidemiologia , Humanos , Austrália/epidemiologia , SARS-CoV-2 , Atenção à Saúde/organização & administração , Pessoal de Saúde/psicologia , Entrevistas como Assunto , Feminino , Masculino

3.

Evaluating Artificial Intelligence in Clinical Settings-Let Us Not Reinvent the Wheel.

Cresswell, Kathrin; de Keizer, Nicolette; Magrabi, Farah; Williams, Robin; Rigby, Michael; Prgomet, Mirela; Kukhareva, Polina; Wong, Zoie Shui-Yee; Scott, Philip; Craven, Catherine K; Georgiou, Andrew; Medlock, Stephanie; Brender McNair, Jytte; Ammenwerth, Elske.

J Med Internet Res ; 26: e46407, 2024 Aug 07.

Artigo em Inglês | MEDLINE | ID: mdl-39110494

RESUMO

Given the requirement to minimize the risks and maximize the benefits of technology applications in health care provision, there is an urgent need to incorporate theory-informed health IT (HIT) evaluation frameworks into existing and emerging guidelines for the evaluation of artificial intelligence (AI). Such frameworks can help developers, implementers, and strategic decision makers to build on experience and the existing empirical evidence base. We provide a pragmatic conceptual overview of selected concrete examples of how existing theory-informed HIT evaluation frameworks may be used to inform the safe development and implementation of AI in health care settings. The list is not exhaustive and is intended to illustrate applications in line with various stakeholder requirements. Existing HIT evaluation frameworks can help to inform AI-based development and implementation by supporting developers and strategic decision makers in considering relevant technology, user, and organizational dimensions. This can facilitate the design of technologies, their implementation in user and organizational settings, and the sustainability and scalability of technologies.

Assuntos

Inteligência Artificial , Humanos , Informática Médica/métodos

4.

Clinical decision support versus a paper-based protocol for massive transfusion: Impact on decision outcomes in a simulation study.

Sanderson, Brenton J; Field, Jeremy D; Kocaballi, Ahmet B; Estcourt, Lise J; Magrabi, Farah; Wood, Erica M; Coiera, Enrico.

Transfusion ; 63(12): 2225-2233, 2023 12.

Artigo em Inglês | MEDLINE | ID: mdl-37921017

RESUMO

BACKGROUND: Management of major hemorrhage frequently requires massive transfusion (MT) support, which should be delivered effectively and efficiently. We have previously developed a clinical decision support system (CDS) for MT using a multicenter multidisciplinary user-centered design study. Here we examine its impact when administering a MT. STUDY DESIGN AND METHODS: We conducted a randomized simulation trial to compare a CDS for MT with a paper-based MT protocol for the management of simulated hemorrhage. A total of 44 specialist physicians, trainees (residents), and nurses were recruited across critical care to participate in two 20-min simulated bleeding scenarios. The primary outcome was the decision velocity (correct decisions per hour) and overall task completion. Secondary outcomes included cognitive workload and System Usability Scale (SUS). RESULTS: There was a statistically significant increase in decision velocity for CDS-based management (mean 8.5 decisions per hour) compared to paper based (mean 6.9 decisions per hour; p .003, 95% CI 0.6-2.6). There was no significant difference in the overall task completion using CDS-based management (mean 13.3) compared to paper-based (mean 13.2; p .92, 95% CI -1.2-1.3). Cognitive workload was statistically significantly lower using the CDS compared to the paper protocol (mean 57.1 vs. mean 64.5, p .005, 95% CI 2.4-12.5). CDS usability was assessed as a SUS score of 82.5 (IQR 75-87.5). DISCUSSION: Compared to paper-based management, CDS-based MT supports more time-efficient decision-making by users with limited CDS training and achieves similar overall task completion while reducing cognitive load. Clinical implementation will determine whether the benefits demonstrated translate to improved patient outcomes.

Assuntos

Sistemas de Apoio a Decisões Clínicas , Humanos , Simulação por Computador , Hemorragia , Estudos Multicêntricos como Assunto , Carga de Trabalho

5.

Multicenter, multidisciplinary user-centered design of a clinical decision-support and simulation system for massive transfusion.

Sanderson, Brenton; Field, Jeremy D; Kocaballi, Ahmet B; Estcourt, Lise J; Magrabi, Farah; Wood, Erica M; Coiera, Enrico W.

Transfusion ; 63(5): 993-1004, 2023 05.

Artigo em Inglês | MEDLINE | ID: mdl-36960741

RESUMO

BACKGROUND: Managing critical bleeding with massive transfusion (MT) requires a multidisciplinary team, often physically separated, to perform several simultaneous tasks at short notice. This places a significant cognitive load on team members, who must maintain situational awareness in rapidly changing scenarios. Similar resuscitation scenarios have benefited from the use of clinical decision support (CDS) tools. STUDY DESIGN AND METHODS: A multicenter, multidisciplinary, user-centered design (UCD) study was conducted to design a computerized CDS for MT. This study included analysis of the problem context with a cognitive walkthrough, development of a user requirement statement, and co-design with users of prototypes for testing. The final prototype was evaluated using qualitative assessment and the System Usability Scale (SUS). RESULTS: Eighteen participants were recruited across four institutions. The first UCD cycle resulted in the development of four prototype interfaces that addressed the user requirements and context of implementation. Of these, the preferred interface was further developed in the second UCD cycle to create a high-fidelity web-based CDS for MT. This prototype was evaluated by 15 participants using a simulated bleeding scenario and demonstrated an average SUS of 69.3 (above average, SD 16) and a clear interface with easy-to-follow blood product tracking. DISCUSSION: We used a UCD process to explore a highly complex clinical scenario and develop a prototype CDS for MT that incorporates distributive situational awareness, supports multiple user roles, and allows simulated MT training. Evaluation of the impact of this prototype on the efficacy and efficiency of managing MT is currently underway.

Assuntos

Sistemas de Apoio a Decisões Clínicas , Humanos , Design Centrado no Usuário , Transfusão de Sangue , Conscientização , Simulação por Computador

6.

Artificial intelligence in medicine: has the time come to hang up the stethoscope?

Lyell, David; Magrabi, Farah.

Intern Med J ; 53(9): 1533-1539, 2023 09.

Artigo em Inglês | MEDLINE | ID: mdl-37683094

RESUMO

The question of whether the time has come to hang up the stethoscope is bound up in the promises of artificial intelligence (AI), promises that have so far proven difficult to deliver, perhaps because of the mismatch between the technical capability of AI and its use in real-world clinical settings. This perspective argues that it is time to move away from discussing the generalised promise of disembodied AI and focus on specifics. We need to focus on how the computational method underlying AI, i.e. machine learning (ML), is embedded into tools, how those tools contribute to clinical tasks and decisions and to what extent they can be relied on. Accordingly, we pose four questions that must be asked to make the discussion real and to understand how ML tools contribute to health care: (1) What does the ML algorithm do? (2) How is output of the ML algorithm used in clinical tools? (3) What does the ML tool contribute to clinical tasks or decisions? (4) Can clinicians act or rely on the ML tool? Two exemplar ML tools are examined to show how these questions can be used to better understand the role of ML in supporting clinical tasks and decisions. Ultimately, ML is just a fancy method of automation. We show that it is useful in automating specific and narrowly defined clinical tasks but likely incapable of automating the full gamut of decisions and tasks performed by clinicians.

Assuntos

Medicina , Estetoscópios , Humanos , Inteligência Artificial , Algoritmos

7.

Ethical Guidance for Hard Decisions: A Critical Review of Early International COVID-19 ICU Triage Guidelines.

Aquino, Yves Saint James; Rogers, Wendy A; Scully, Jackie Leach; Magrabi, Farah; Carter, Stacy M.

Health Care Anal ; 30(2): 163-195, 2022 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-34704198

RESUMO

This article provides a critical comparative analysis of the substantive and procedural values and ethical concepts articulated in guidelines for allocating scarce resources in the COVID-19 pandemic. We identified 21 local and national guidelines written in English, Spanish, German and French; applicable to specific and identifiable jurisdictions; and providing guidance to clinicians for decision making when allocating critical care resources during the COVID-19 pandemic. US guidelines were not included, as these had recently been reviewed elsewhere. Information was extracted from each guideline on: 1) the development process; 2) the presence and nature of ethical, medical and social criteria for allocating critical care resources; and 3) the membership of and decision-making procedure of any triage committees. Results of our analysis show the majority appealed primarily to consequentialist reasoning in making allocation decisions, tempered by a largely pluralistic approach to other substantive and procedural values and ethical concepts. Medical and social criteria included medical need, co-morbidities, prognosis, age, disability and other factors, with a focus on seemingly objective medical criteria. There was little or no guidance on how to reconcile competing criteria, and little attention to internal contradictions within individual guidelines. Our analysis reveals the challenges in developing sound ethical guidance for allocating scarce medical resources, highlighting problems in operationalising ethical concepts and principles, divergence between guidelines, unresolved contradictions within the same guideline, and use of naïve objectivism in employing widely used medical criteria for allocating ICU resources.

Assuntos

COVID-19 , COVID-19/epidemiologia , Cuidados Críticos , Alocação de Recursos para a Atenção à Saúde , Humanos , Unidades de Terapia Intensiva , Pandemias , Triagem/métodos

8.

Evaluating the Impact of the Grading and Assessment of Predictive Tools Framework on Clinicians and Health Care Professionals' Decisions in Selecting Clinical Predictive Tools: Randomized Controlled Trial.

Khalifa, Mohamed; Magrabi, Farah; Gallego Luxan, Blanca.

J Med Internet Res ; 22(7): e15770, 2020 07 09.

Artigo em Inglês | MEDLINE | ID: mdl-32673228

RESUMO

BACKGROUND: While selecting predictive tools for implementation in clinical practice or for recommendation in clinical guidelines, clinicians and health care professionals are challenged with an overwhelming number of tools. Many of these tools have never been implemented or evaluated for comparative effectiveness. To overcome this challenge, the authors developed and validated an evidence-based framework for grading and assessment of predictive tools (the GRASP framework). This framework was based on the critical appraisal of the published evidence on such tools. OBJECTIVE: The aim of the study was to examine the impact of using the GRASP framework on clinicians' and health care professionals' decisions in selecting clinical predictive tools. METHODS: A controlled experiment was conducted through a web-based survey. Participants were randomized to either review the derivation publications, such as studies describing the development of the predictive tools, on common traumatic brain injury predictive tools (control group) or to review an evidence-based summary, where each tool had been graded and assessed using the GRASP framework (intervention group). Participants in both groups were asked to select the best tool based on the greatest validation or implementation. A wide group of international clinicians and health care professionals were invited to participate in the survey. Task completion time, rate of correct decisions, rate of objective versus subjective decisions, and level of decisional conflict were measured. RESULTS: We received a total of 194 valid responses. In comparison with not using GRASP, using the framework significantly increased correct decisions by 64%, from 53.7% to 88.1% (88.1/53.7=1.64; t193=8.53; P<.001); increased objective decision making by 32%, from 62% (3.11/5) to 82% (4.10/5; t189=9.24; P<.001); decreased subjective decision making based on guessing by 20%, from 49% (2.48/5) to 39% (1.98/5; t188=-5.47; P<.001); and decreased prior knowledge or experience by 8%, from 71% (3.55/5) to 65% (3.27/5; t187=-2.99; P=.003). Using GRASP significantly decreased decisional conflict and increased the confidence and satisfaction of participants with their decisions by 11%, from 71% (3.55/5) to 79% (3.96/5; t188=4.27; P<.001), and by 13%, from 70% (3.54/5) to 79% (3.99/5; t188=4.89; P<.001), respectively. Using GRASP decreased the task completion time, on the 90th percentile, by 52%, from 12.4 to 6.4 min (t193=-0.87; P=.38). The average System Usability Scale of the GRASP framework was very good: 72.5% and 88% (108/122) of the participants found the GRASP useful. CONCLUSIONS: Using GRASP has positively supported and significantly improved evidence-based decision making. It has increased the accuracy and efficiency of selecting predictive tools. GRASP is not meant to be prescriptive; it represents a high-level approach and an effective, evidence-based, and comprehensive yet simple and feasible method to evaluate, compare, and select clinical predictive tools.

Assuntos

Tomada de Decisão Clínica/métodos , Pessoal de Saúde/normas , Feminino , Humanos , Masculino , Inquéritos e Questionários

9.

Effect of Speech Recognition on Problem Solving and Recall in Consumer Digital Health Tasks: Controlled Laboratory Experiment.

Chen, Jessica; Lyell, David; Laranjo, Liliana; Magrabi, Farah.

J Med Internet Res ; 22(6): e14827, 2020 06 01.

Artigo em Inglês | MEDLINE | ID: mdl-32442129

RESUMO

BACKGROUND: Recent advances in natural language processing and artificial intelligence have led to widespread adoption of speech recognition technologies. In consumer health applications, speech recognition is usually applied to support interactions with conversational agents for data collection, decision support, and patient monitoring. However, little is known about the use of speech recognition in consumer health applications and few studies have evaluated the efficacy of conversational agents in the hands of consumers. In other consumer-facing tools, cognitive load has been observed to be an important factor affecting the use of speech recognition technologies in tasks involving problem solving and recall. Users find it more difficult to think and speak at the same time when compared to typing, pointing, and clicking. However, the effects of speech recognition on cognitive load when performing health tasks has not yet been explored. OBJECTIVE: The aim of this study was to evaluate the use of speech recognition for documentation in consumer digital health tasks involving problem solving and recall. METHODS: Fifty university staff and students were recruited to undertake four documentation tasks with a simulated conversational agent in a computer laboratory. The tasks varied in complexity determined by the amount of problem solving and recall required (simple and complex) and the input modality (speech recognition vs keyboard and mouse). Cognitive load, task completion time, error rate, and usability were measured. RESULTS: Compared to using a keyboard and mouse, speech recognition significantly increased the cognitive load for complex tasks (Z=-4.08, P<.001) and simple tasks (Z=-2.24, P=.03). Complex tasks took significantly longer to complete (Z=-2.52, P=.01) and speech recognition was found to be overall less usable than a keyboard and mouse (Z=-3.30, P=.001). However, there was no effect on errors. CONCLUSIONS: Use of a keyboard and mouse was preferable to speech recognition for complex tasks involving problem solving and recall. Further studies using a broader variety of consumer digital health tasks of varying complexity are needed to investigate the contexts in which use of speech recognition is most appropriate. The effects of cognitive load on task performance and its significance also need to be investigated.

Assuntos

Informática Aplicada à Saúde dos Consumidores/métodos , Laboratórios/normas , Resolução de Problemas/fisiologia , Interface para o Reconhecimento da Fala/normas , Adolescente , Adulto , Feminino , Humanos , Masculino , Pessoa de Meia-Idade , Adulto Jovem

10.

Responses of Conversational Agents to Health and Lifestyle Prompts: Investigation of Appropriateness and Presentation Structures.

Kocaballi, Ahmet Baki; Quiroz, Juan C; Rezazadegan, Dana; Berkovsky, Shlomo; Magrabi, Farah; Coiera, Enrico; Laranjo, Liliana.

J Med Internet Res ; 22(2): e15823, 2020 02 09.

Artigo em Inglês | MEDLINE | ID: mdl-32039810

RESUMO

BACKGROUND: Conversational agents (CAs) are systems that mimic human conversations using text or spoken language. Their widely used examples include voice-activated systems such as Apple Siri, Google Assistant, Amazon Alexa, and Microsoft Cortana. The use of CAs in health care has been on the rise, but concerns about their potential safety risks often remain understudied. OBJECTIVE: This study aimed to analyze how commonly available, general-purpose CAs on smartphones and smart speakers respond to health and lifestyle prompts (questions and open-ended statements) by examining their responses in terms of content and structure alike. METHODS: We followed a piloted script to present health- and lifestyle-related prompts to 8 CAs. The CAs' responses were assessed for their appropriateness on the basis of the prompt type: responses to safety-critical prompts were deemed appropriate if they included a referral to a health professional or service, whereas responses to lifestyle prompts were deemed appropriate if they provided relevant information to address the problem prompted. The response structure was also examined according to information sources (Web search-based or precoded), response content style (informative and/or directive), confirmation of prompt recognition, and empathy. RESULTS: The 8 studied CAs provided in total 240 responses to 30 prompts. They collectively responded appropriately to 41% (46/112) of the safety-critical and 39% (37/96) of the lifestyle prompts. The ratio of appropriate responses deteriorated when safety-critical prompts were rephrased or when the agent used a voice-only interface. The appropriate responses included mostly directive content and empathy statements for the safety-critical prompts and a mix of informative and directive content for the lifestyle prompts. CONCLUSIONS: Our results suggest that the commonly available, general-purpose CAs on smartphones and smart speakers with unconstrained natural language interfaces are limited in their ability to advise on both the safety-critical health prompts and lifestyle prompts. Our study also identified some response structures the CAs employed to present their appropriate responses. Further investigation is needed to establish guidelines for designing suitable response structures for different prompt types.

Assuntos

Comunicação , Estilo de Vida , Humanos

11.

Developing a framework for evidence-based grading and assessment of predictive tools for clinical decision support.

Khalifa, Mohamed; Magrabi, Farah; Gallego, Blanca.

BMC Med Inform Decis Mak ; 19(1): 207, 2019 10 29.

Artigo em Inglês | MEDLINE | ID: mdl-31664998

RESUMO

BACKGROUND: Clinical predictive tools quantify contributions of relevant patient characteristics to derive likelihood of diseases or predict clinical outcomes. When selecting predictive tools for implementation at clinical practice or for recommendation in clinical guidelines, clinicians are challenged with an overwhelming and ever-growing number of tools, most of which have never been implemented or assessed for comparative effectiveness. To overcome this challenge, we have developed a conceptual framework to Grade and Assess Predictive tools (GRASP) that can provide clinicians with a standardised, evidence-based system to support their search for and selection of efficient tools. METHODS: A focused review of the literature was conducted to extract criteria along which tools should be evaluated. An initial framework was designed and applied to assess and grade five tools: LACE Index, Centor Score, Well's Criteria, Modified Early Warning Score, and Ottawa knee rule. After peer review, by six expert clinicians and healthcare researchers, the framework and the grading of the tools were updated. RESULTS: GRASP framework grades predictive tools based on published evidence across three dimensions: 1) Phase of evaluation; 2) Level of evidence; and 3) Direction of evidence. The final grade of a tool is based on the highest phase of evaluation, supported by the highest level of positive evidence, or mixed evidence that supports a positive conclusion. Ottawa knee rule had the highest grade since it has demonstrated positive post-implementation impact on healthcare. LACE Index had the lowest grade, having demonstrated only pre-implementation positive predictive performance. CONCLUSION: GRASP framework builds on widely accepted concepts to provide standardised assessment and evidence-based grading of predictive tools. Unlike other methods, GRASP is based on the critical appraisal of published evidence reporting the tools' predictive performance before implementation, potential effect and usability during implementation, and their post-implementation impact. Implementing the GRASP framework as an online platform can enable clinicians and guideline developers to access standardised and structured reported evidence of existing predictive tools. However, keeping GRASP reports up-to-date would require updating tools' assessments and grades when new evidence becomes available, which can only be done efficiently by employing semi-automated methods for searching and processing the incoming information.

Assuntos

Sistemas de Apoio a Decisões Clínicas , Medicina Baseada em Evidências , Valor Preditivo dos Testes , Atenção à Saúde , Humanos , Funções Verossimilhança , Avaliação de Resultados em Cuidados de Saúde

12.

Delay in reviewing test results prolongs hospital length of stay: a retrospective cohort study.

Ong, Mei-Sing; Magrabi, Farah; Coiera, Enrico.

BMC Health Serv Res ; 18(1): 369, 2018 05 16.

Artigo em Inglês | MEDLINE | ID: mdl-29769074

RESUMO

BACKGROUND: Failure in the timely follow-up of test results has been widely documented, contributing to delayed medical care. Yet, the impact of delay in reviewing test results on hospital length of stay (LOS) has not been studied. We examine the relationship between laboratory tests review time and hospital LOS. METHODS: A retrospective cohort study of inpatients admitted to a metropolitan teaching hospital in Sydney, Australia, between 2011 and 2012 (n = 5804). Generalized linear models were developed to examine the relationship between hospital LOS and cumulative clinician read time (CRT), defined as the time taken by clinicians to review laboratory test results performed during an inpatient stay after they were reported in the computerized test reporting system. The models were adjusted for patients' age, sex, and disease severity (measured by the Charlson Comorbidity index), the number of test panels performed, the number of unreviewed tests pre-discharge, and the cumulative laboratory turnaround time (LTAT) of tests performed during an inpatient stay. RESULTS: Cumulative CRT is significantly associated with prolonged LOS, with each day of delay in reviewing test results increasing the likelihood of prolonged LOS by 13.2% (p < 0.0001). Restricting the analysis to tests with abnormal results strengthened the relationship between cumulative CRT and prolonged LOS, with each day of delay in reviewing test results increasing the likelihood of delayed discharge by 33.6% (p < 0.0001). Increasing age, disease severity and total number of tests were also significantly associated with prolonged LOS. Increasing number of unreviewed tests was negatively associated with prolonged LOS. CONCLUSIONS: Reducing unnecessary hospital LOS has become a critical health policy goal as healthcare costs escalate. Preventing delay in reviewing test results represents an important opportunity to address potentially avoidable hospital stays and unnecessary resource utilization.

Assuntos

Diagnóstico Tardio/estatística & dados numéricos , Testes Diagnósticos de Rotina/estatística & dados numéricos , Tempo de Internação/estatística & dados numéricos , Adulto , Idoso , Feminino , Hospitais Urbanos/estatística & dados numéricos , Humanos , Tempo de Internação/economia , Masculino , Pessoa de Meia-Idade , New South Wales , Alta do Paciente/estatística & dados numéricos , Estudos Retrospectivos , Procedimentos Desnecessários/estatística & dados numéricos

13.

The Effect of Cognitive Load and Task Complexity on Automation Bias in Electronic Prescribing.

Lyell, David; Magrabi, Farah; Coiera, Enrico.

Hum Factors ; 60(7): 1008-1021, 2018 11.

Artigo em Inglês | MEDLINE | ID: mdl-29939764

RESUMO

OBJECTIVE: Determine the relationship between cognitive load (CL) and automation bias (AB). BACKGROUND: Clinical decision support (CDS) for electronic prescribing can improve safety but introduces the risk of AB, where reliance on CDS replaces vigilance in information seeking and processing. We hypothesized high CL generated by high task complexity would increase AB errors. METHOD: One hundred twenty medical students prescribed medicines for clinical scenarios using a simulated e-prescribing system in a randomized controlled experiment. Quality of CDS (correct, incorrect, and no CDS) and task complexity (low and high) were varied. CL, omission errors (failure to detect prescribing errors), and commission errors (acceptance of false positive alerts) were measured. RESULTS: Increasing complexity from low to high significantly increased CL, F(1, 118) = 71.6, p < .001. CDS reduced CL in high-complexity conditions compared to no CDS, F(2, 117) = 4.72, p = .015. Participants who made omission errors in incorrect and no CDS conditions exhibited lower CL compared to those who did not, F(1, 636.49) = 3.79, p = .023. CONCLUSION: Results challenge the notion that AB is triggered by increasing task complexity and associated increases in CL. Omission errors were associated with lower CL, suggesting errors may stem from an insufficient allocation of cognitive resources. APPLICATION: This is the first research to examine the relationship between CL and AB. Findings suggest designers and users of CDS systems need to be aware of the risks of AB. Interventions that increase user vigilance and engagement may be beneficial and deserve further investigation.

Assuntos

Sistemas de Apoio a Decisões Clínicas , Prescrição Eletrônica , Função Executiva/fisiologia , Sistemas Homem-Máquina , Memória de Curto Prazo/fisiologia , Análise e Desempenho de Tarefas , Adulto , Feminino , Humanos , Masculino , Adulto Jovem

14.

Using multiclass classification to automate the identification of patient safety incident reports by type and severity.

Wang, Ying; Coiera, Enrico; Runciman, William; Magrabi, Farah.

BMC Med Inform Decis Mak ; 17(1): 84, 2017 Jun 12.

Artigo em Inglês | MEDLINE | ID: mdl-28606174

RESUMO

BACKGROUND: Approximately 10% of admissions to acute-care hospitals are associated with an adverse event. Analysis of incident reports helps to understand how and why incidents occur and can inform policy and practice for safer care. Unfortunately our capacity to monitor and respond to incident reports in a timely manner is limited by the sheer volumes of data collected. In this study, we aim to evaluate the feasibility of using multiclass classification to automate the identification of patient safety incidents in hospitals. METHODS: Text based classifiers were applied to identify 10 incident types and 4 severity levels. Using the one-versus-one (OvsO) and one-versus-all (OvsA) ensemble strategies, we evaluated regularized logistic regression, linear support vector machine (SVM) and SVM with a radial-basis function (RBF) kernel. Classifiers were trained and tested with "balanced" datasets (n_ Type = 2860, n_ SeverityLevel = 1160) from a state-wide incident reporting system. Testing was also undertaken with imbalanced "stratified" datasets (n_ Type = 6000, n_ SeverityLevel =5950) from the state-wide system and an independent hospital reporting system. Classifier performance was evaluated using a confusion matrix, as well as F-score, precision and recall. RESULTS: The most effective combination was a OvsO ensemble of binary SVM RBF classifiers with binary count feature extraction. For incident type, classifiers performed well on balanced and stratified datasets (F-score: 78.3, 73.9%), but were worse on independent datasets (68.5%). Reports about falls, medications, pressure injury, aggression and blood products were identified with high recall and precision. "Documentation" was the hardest type to identify. For severity level, F-score for severity assessment code (SAC) 1 (extreme risk) was 87.3 and 64% for SAC4 (low risk) on balanced data. With stratified data, high recall was achieved for SAC1 (82.8-84%) but precision was poor (6.8-11.2%). High risk incidents (SAC2) were confused with medium risk incidents (SAC3). CONCLUSIONS: Binary classifier ensembles appear to be a feasible method for identifying incidents by type and severity level. Automated identification should enable safety problems to be detected and addressed in a more timely manner. Multi-label classifiers may be necessary for reports that relate to more than one incident type.

Assuntos

Classificação/métodos , Mineração de Dados/métodos , Informática Médica/métodos , Segurança do Paciente , Gestão de Riscos , Máquina de Vetores de Suporte , Humanos , Segurança do Paciente/estatística & dados numéricos , Gestão de Riscos/estatística & dados numéricos

15.

Automation bias in electronic prescribing.

Lyell, David; Magrabi, Farah; Raban, Magdalena Z; Pont, L G; Baysari, Melissa T; Day, Richard O; Coiera, Enrico.

BMC Med Inform Decis Mak ; 17(1): 28, 2017 03 16.

Artigo em Inglês | MEDLINE | ID: mdl-28302112

RESUMO

BACKGROUND: Clinical decision support (CDS) in e-prescribing can improve safety by alerting potential errors, but introduces new sources of risk. Automation bias (AB) occurs when users over-rely on CDS, reducing vigilance in information seeking and processing. Evidence of AB has been found in other clinical tasks, but has not yet been tested with e-prescribing. This study tests for the presence of AB in e-prescribing and the impact of task complexity and interruptions on AB. METHODS: One hundred and twenty students in the final two years of a medical degree prescribed medicines for nine clinical scenarios using a simulated e-prescribing system. Quality of CDS (correct, incorrect and no CDS) and task complexity (low, low + interruption and high) were varied between conditions. Omission errors (failure to detect prescribing errors) and commission errors (acceptance of false positive alerts) were measured. RESULTS: Compared to scenarios with no CDS, correct CDS reduced omission errors by 38.3% (p < .0001, n = 120), 46.6% (p < .0001, n = 70), and 39.2% (p < .0001, n = 120) for low, low + interrupt and high complexity scenarios respectively. Incorrect CDS increased omission errors by 33.3% (p < .0001, n = 120), 24.5% (p < .009, n = 82), and 26.7% (p < .0001, n = 120). Participants made commission errors, 65.8% (p < .0001, n = 120), 53.5% (p < .0001, n = 82), and 51.7% (p < .0001, n = 120). Task complexity and interruptions had no impact on AB. CONCLUSIONS: This study found evidence of AB omission and commission errors in e-prescribing. Verification of CDS alerts is key to avoiding AB errors. However, interventions focused on this have had limited success to date. Clinicians should remain vigilant to the risks of CDS failures and verify CDS.

Assuntos

Automação/normas , Sistemas de Apoio a Decisões Clínicas/normas , Prescrição Eletrônica/normas , Erros de Medicação/prevenção & controle , Estudantes de Medicina , Humanos

16.

Measuring the effects of computer downtime on hospital pathology processes.

Wang, Ying; Coiera, Enrico; Gallego, Blanca; Concha, Oscar Perez; Ong, Mei-Sing; Tsafnat, Guy; Roffe, David; Jones, Graham; Magrabi, Farah.

J Biomed Inform ; 59: 308-15, 2016 Feb.

Artigo em Inglês | MEDLINE | ID: mdl-26732996

RESUMO

OBJECTIVE: To introduce and evaluate a method that uses electronic medical record (EMR) data to measure the effects of computer system downtime on clinical processes associated with pathology testing and results reporting. MATERIALS AND METHODS: A matched case-control design was used to examine the effects of five downtime events over 11-months, ranging from 5 to 300min. Four indicator tests representing different laboratory workflows were selected to measure delays and errors: potassium, haemoglobon, troponin and activated partial thromboplastin time. Tests exposed to a downtime were matched to tests during unaffected control periods by test type, time of day and day of week. Measures included clinician read time (CRT), laboratory turnaround time (LTAT), and rates of missed reads, futile searches, duplicate orders, and missing test results. RESULTS: The effects of downtime varied with the type of IT problem. When clinicians could not logon to a results reporting system for 17-min, the CRT for potassium and haemoglobon tests was five (10.3 vs. 2.0days) and six times (13.4 vs. 2.1days) longer than control (p=0.01-0.04; p=0.0001-0.003). Clinician follow-up of tests was also delayed by another downtime involving a power outage with a small effect. In contrast, laboratory processing of troponin tests was unaffected by network services and routing problems. Errors including missed reads, futile searches, duplicate orders and missing test results could not be examined because the sample size of affected tests was not sufficient for statistical testing. CONCLUSION: This study demonstrates the feasibility of using routinely collected EMR data with a matched case-control design to measure the effects of downtime on clinical processes. Even brief system downtimes may impact patient care. The methodology has potential to be applied to other clinical processes with established workflows where tasks are pre-defined such as medications management.

Assuntos

Redes de Comunicação de Computadores/normas , Falha de Equipamento/estatística & dados numéricos , Informática Médica/normas , Segurança do Paciente , Estudos de Casos e Controles , Humanos , Laboratórios Hospitalares , Fluxo de Trabalho

17.

Predicting the cumulative risk of death during hospitalization by modeling weekend, weekday and diurnal mortality risks.

Coiera, Enrico; Wang, Ying; Magrabi, Farah; Concha, Oscar Perez; Gallego, Blanca; Runciman, William.

BMC Health Serv Res ; 14: 226, 2014 May 21.

Artigo em Inglês | MEDLINE | ID: mdl-24886152

RESUMO

BACKGROUND: Current prognostic models factor in patient and disease specific variables but do not consider cumulative risks of hospitalization over time. We developed risk models of the likelihood of death associated with cumulative exposure to hospitalization, based on time-varying risks of hospitalization over any given day, as well as day of the week. Model performance was evaluated alone, and in combination with simple disease-specific models. METHOD: Patients admitted between 2000 and 2006 from 501 public and private hospitals in NSW, Australia were used for training and 2007 data for evaluation. The impact of hospital care delivered over different days of the week and or times of the day was modeled by separating hospitalization risk into 21 separate time periods (morning, day, night across the days of the week). Three models were developed to predict death up to 7-days post-discharge: 1/a simple background risk model using age, gender; 2/a time-varying risk model for exposure to hospitalization (admission time, days in hospital); 3/disease specific models (Charlson co-morbidity index, DRG). Combining these three generated a full model. Models were evaluated by accuracy, AUC, Akaike and Bayesian information criteria. RESULTS: There was a clear diurnal rhythm to hospital mortality in the data set, peaking in the evening, as well as the well-known 'weekend-effect' where mortality peaks with weekend admissions. Individual models had modest performance on the test data set (AUC 0.71, 0.79 and 0.79 respectively). The combined model which included time-varying risk however yielded an average AUC of 0.92. This model performed best for stays up to 7-days (93% of admissions), peaking at days 3 to 5 (AUC 0.94). CONCLUSIONS: Risks of hospitalization vary not just with the day of the week but also time of the day, and can be used to make predictions about the cumulative risk of death associated with an individual's hospitalization. Combining disease specific models with such time varying- estimates appears to result in robust predictive performance. Such risk exposure models should find utility both in enhancing standard prognostic models as well as estimating the risk of continuation of hospitalization.

Assuntos

Mortalidade Hospitalar , Modelos Teóricos , Risco , Adolescente , Adulto , Idoso , Idoso de 80 Anos ou mais , Criança , Pré-Escolar , Ritmo Circadiano , Feminino , Humanos , Lactente , Recém-Nascido , Masculino , Pessoa de Meia-Idade , New South Wales/epidemiologia , Medição de Risco/métodos , Fatores de Tempo , Adulto Jovem

18.

Assessing the Safety of a New Clinical Decision Support System for a National Helpline.

Luckraj, Nirvana; Strazzari, Renee; Coiera, Enrico; Magrabi, Farah.

Stud Health Technol Inform ; 310: 514-518, 2024 Jan 25.

Artigo em Inglês | MEDLINE | ID: mdl-38269862

RESUMO

We assessed the safety of a new clinical decision support system (CDSS) for nurses on Australia's national consumer helpline. Accuracy and safety of triage advice was assessed by testing the CDSS using 78 standardised patient vignettes (48 published and 30 proprietary). Testing was undertaken in two cycles using the CDSS vendor's online evaluation tool (Cycle 1: 47 vignettes; Cycle 2: 41 vignettes). Safety equivalence was examined by testing the existing CDSS with the 47 vignettes from Cycle 1. The new CDSS triaged 66% of vignettes correctly compared to 57% by the existing CDSS. 15% of vignettes were overtriaged by the new CDSS compared to 28% by the existing CDSS. 19% of vignettes were undertriaged by the new CDSS compared to 15% by the existing CDSS. Overall performance of the new CDSS appears consistent and comparable with current studies. The new CDSS is at least as safe as the old CDSS.

Assuntos

Sistemas de Apoio a Decisões Clínicas , Humanos , Sistemas Inteligentes , Software , Triagem

19.

Automating the Identification of Safety Events Involving Machine Learning-Enabled Medical Devices.

Wang, Ying; Lyell, David; Coiera, Enrico; Magrabi, Farah.

Stud Health Technol Inform ; 310: 604-608, 2024 Jan 25.

Artigo em Inglês | MEDLINE | ID: mdl-38269880

RESUMO

With growing use of machine learning (ML)-enabled medical devices by clinicians and consumers safety events involving these systems are emerging. Current analysis of safety events heavily relies on retrospective review by experts, which is time consuming and cost ineffective. This study develops automated text classifiers and evaluates their potential to identify rare ML safety events from the US FDA's MAUDE. Four stratified classifiers were evaluated using a real-world data distribution with different feature sets: report text; text and device brand name; text and generic device type; and all information combined. We found that stratified classifiers using the generic type of devices were the most effective technique when tested on both stratified (F1-score=85%) and external datasets (precision=100%). All true positives on the external dataset were consistently identified by the three stratified classifiers, indicating the ensemble results from them can be used directly to monitor ML events reported to MAUDE.

Assuntos

Medicamentos Genéricos , Aprendizado de Máquina

20.

Using Clinical Simulation to Evaluate AI-Enabled Decision Support.

Lyell, David; Lustig, Adriaan; Denyer, Kate; Vedantam, Satya; Magrabi, Farah.

Stud Health Technol Inform ; 310: 299-303, 2024 Jan 25.

Artigo em Inglês | MEDLINE | ID: mdl-38269813

RESUMO

Clinical simulation is a useful method for evaluating AI-enabled clinical decision support (CDS). Simulation studies permit patient- and risk-free evaluation and far greater experimental control than is possible with clinical studies. The effect of CDS assisted and unassisted patient scenarios on meaningful downstream decisions and actions within the information value chain can be evaluated as outcome measures. This paper discusses the use of clinical simulation in CDS evaluation and presents a case study to demonstrate feasibility of its application.

Assuntos

Inteligência Artificial , Humanos , Simulação por Computador

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA