Results 1 - 6 of 6

1.
J Endourol ; 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-37905524

ABSTRACT

Introduction: Automated skills assessment can provide surgical trainees with objective, personalized feedback during training. Here, we measure the efficacy of artificial intelligence (AI)-based feedback on a robotic suturing task. Materials and Methods: Forty-two participants with no robotic surgical experience were randomized to a control or feedback group and video-recorded while completing two rounds (R1 and R2) of suturing tasks on a da Vinci surgical robot. Participants were assessed on needle handling and needle driving, and feedback was provided via a visual interface after R1. In the feedback group, participants were informed of their AI-based skill assessment and presented with specific video clips from R1. In the control group, participants were presented with randomly selected video clips from R1 as a placebo. Participants from each group were further labeled as underperformers or innate-performers based on a median split of their technical skill scores from R1. Results: Demographic features were similar between the control (n = 20) and feedback (n = 22) groups (p > 0.05). From R1 to R2, the feedback group had a significantly larger improvement in needle handling score than the control group (0.30 vs -0.02, p = 0.018), although the difference in improvement of needle driving score was not significant (0.17 vs -0.40, p = 0.074). All innate-performers exhibited similar improvements across rounds, regardless of feedback (p > 0.05). In contrast, underperformers in the feedback group improved more than those in the control group in needle handling (p = 0.02). Conclusion: AI-based feedback facilitates surgical trainees' acquisition of robotic technical skills, especially among underperformers. Future research will extend AI-based feedback to additional suturing skills, surgical tasks, and experience groups.
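
Editorial illustration (not part of the abstract): the median split mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions. The function name `median_split`, the participant IDs, and the scores are hypothetical, and the abstract does not say how scores equal to the median were labeled, so assigning ties to the innate-performer group is an assumption.

```python
from statistics import median


def median_split(r1_scores):
    """Label each participant by comparing the R1 score with the group median.

    Assumption: scores at or above the median count as "innate-performer";
    the abstract does not specify how ties were handled.
    """
    cutoff = median(r1_scores.values())
    return {
        pid: "innate-performer" if score >= cutoff else "underperformer"
        for pid, score in r1_scores.items()
    }


# Hypothetical R1 technical skill scores (illustrative only, not study data).
example_scores = {"P01": 2.0, "P02": 3.5, "P03": 1.0, "P04": 4.0}
labels = median_split(example_scores)
```

With these made-up scores the median is 2.75, so P01 and P03 fall below the cutoff and P02 and P04 fall at or above it.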

2.
J Robot Surg ; 17(2): 597-603, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36149590

ABSTRACT

Our group previously defined a dissection gesture classification system that deconstructs robotic tissue dissection into its most elemental yet meaningful movements. The purpose of this study was to expand upon this framework by adding an assessment of gesture efficacy (ineffective, effective, or erroneous) and to analyze dissection patterns between groups of surgeons of varying experience. We defined three possible gesture efficacies as ineffective (no meaningful effect on the tissue), effective (intended effect on the tissue), and erroneous (unintended disruption of the tissue). Novices (0 prior robotic cases), intermediates (1-99 cases), and experts (≥ 100 cases) completed a robotic dissection task in a dry-lab training environment. Video recordings were reviewed to classify each gesture and determine its efficacy, then dissection patterns between groups were analyzed. Twenty-three participants completed the task, with 9 novices, 8 intermediates with median caseload 60 (IQR 41-80), and 6 experts with median caseload 525 (IQR 413-900). For gesture selection, we found that increasing experience was associated with an increasing proportion of overall dissection gestures (p = 0.009) and a decreasing proportion of retraction gestures (p = 0.009). For gesture efficacy, novices performed the greatest proportion of ineffective gestures (9.8%, p < 0.001), intermediates committed the greatest proportion of erroneous gestures (26.8%, p < 0.001), and the three groups performed similar proportions of overall effective gestures, though experts performed the greatest proportion of effective retraction gestures (85.6%, p < 0.001). Between groups of experience, we found significant differences in gesture selection and gesture efficacy. These relationships may provide insight into further improving surgical training.


Subjects
Robotic Surgical Procedures, Robotics, Humans, Robotic Surgical Procedures/methods, Gestures, Movement
3.
Commun Med (Lond) ; 3(1): 42, 2023 Mar 30.
Article in English | MEDLINE | ID: mdl-36997578

ABSTRACT

BACKGROUND: Surgeons who receive reliable feedback on their performance quickly master the skills necessary for surgery. Such performance-based feedback can be provided by a recently developed artificial intelligence (AI) system that assesses a surgeon's skills based on a surgical video while simultaneously highlighting aspects of the video most pertinent to the assessment. However, it remains an open question whether these highlights, or explanations, are equally reliable for all surgeons. METHODS: Here, we systematically quantify the reliability of AI-based explanations on surgical videos from three hospitals across two continents by comparing them to explanations generated by human experts. To improve the reliability of AI-based explanations, we propose the strategy of training with explanations (TWIX), which uses human explanations as supervision to explicitly teach an AI system to highlight important video frames. RESULTS: We show that while AI-based explanations often align with human explanations, they are not equally reliable for different sub-cohorts of surgeons (e.g., novices vs. experts), a phenomenon we refer to as an explanation bias. We also show that TWIX enhances the reliability of AI-based explanations, mitigates the explanation bias, and improves the performance of AI systems across hospitals. These findings extend to a training environment where medical students can be provided with feedback today. CONCLUSIONS: Our study informs the impending implementation of AI-augmented surgical training and surgeon credentialing programs, and contributes to the safe and fair democratization of surgery.


Surgeons aim to master skills necessary for surgery. One such skill is suturing, which involves connecting objects together through a series of stitches. Mastery of these surgical skills can be improved by providing surgeons with feedback on the quality of their performance. However, such feedback is often absent from surgical practice. Although performance-based feedback can be provided, in theory, by recently developed artificial intelligence (AI) systems that use a computational model to assess a surgeon's skill, the reliability of this feedback remains unknown. Here, we compare AI-based feedback to that provided by human experts and demonstrate that the two often overlap. We also show that explicitly teaching an AI system to align with human feedback further improves the reliability of AI-based feedback on new videos of surgery. Our findings outline the potential of AI systems to support the training of surgeons by providing feedback that is reliable and focused on a particular skill, and guide programs that give surgeons qualifications by complementing skill assessments with explanations that increase the trustworthiness of such assessments.

4.
NPJ Digit Med ; 6(1): 54, 2023 Mar 30.
Article in English | MEDLINE | ID: mdl-36997642

ABSTRACT

Artificial intelligence (AI) systems can now reliably assess surgeon skills through videos of intraoperative surgical activity. With such systems informing future high-stakes decisions such as whether to credential surgeons and grant them the privilege to operate on patients, it is critical that they treat all surgeons fairly. However, it remains an open question whether surgical AI systems exhibit bias against surgeon sub-cohorts, and, if so, whether such bias can be mitigated. Here, we examine and mitigate the bias exhibited by a family of surgical AI systems (SAIS) deployed on videos of robotic surgeries from three geographically diverse hospitals (USA and EU). We show that SAIS exhibits an underskilling bias, erroneously downgrading surgical performance, and an overskilling bias, erroneously upgrading surgical performance, at different rates across surgeon sub-cohorts. To mitigate such bias, we leverage a strategy (TWIX) that teaches an AI system to provide a visual explanation for its skill assessment that otherwise would have been provided by human experts. We show that whereas baseline strategies inconsistently mitigate algorithmic bias, TWIX can effectively mitigate the underskilling and overskilling bias while simultaneously improving the performance of these AI systems across hospitals. We discovered that these findings carry over to the training environment where we assess medical students' skills today. Our study is a critical prerequisite to the eventual implementation of AI-augmented global surgeon credentialing programs, ensuring that all surgeons are treated fairly.

5.
Eur Urol Open Sci ; 46: 15-21, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36506257

ABSTRACT

Background: There is no standard for the feedback that an attending surgeon provides to a training surgeon, which may lead to variable outcomes in teaching cases. Objective: To create and administer standardized feedback to medical students in an attempt to improve performance and learning. Design, setting, and participants: A cohort of 45 medical students was recruited from a single medical school. Participants were randomly assigned to two groups. Both completed two rounds of a robotic surgical dissection task on a da Vinci Xi surgical system. The first round was the baseline assessment. In the second round, one group received feedback and the other served as the control (no feedback). Outcome measurements and statistical analysis: Video from each round was retrospectively reviewed by four blinded raters and given a total error tally (primary outcome) and a technical skills score (Global Evaluative Assessment of Robotic Surgery [GEARS]). Generalized linear models were used for statistical modeling. According to their initial performance, each participant was categorized as either an innate performer or an underperformer, depending on whether their error tally was above or below the median. Results and limitations: In round 2, the intervention group had a larger decrease in error rate than the control group, with a risk ratio (RR) of 1.51 (95% confidence interval [CI] 1.07-2.14; p = 0.02). The intervention group also had a greater increase in GEARS score in comparison to the control group, with a mean group difference of 2.15 (95% CI 0.81-3.49; p < 0.01). The interaction effect between innate performers versus underperformers and the intervention was statistically significant for the error rates, at F(1,38) = 5.16 (p = 0.03). Specifically, the intervention had a statistically significant effect on the error rate for underperformers (RR 2.23, 95% CI 1.37-3.62; p < 0.01) but not for innate performers (RR 1.03, 95% CI 0.63-1.68; p = 0.91). Conclusions: Real-time feedback improved performance globally compared to the control. The benefit of real-time feedback was stronger for underperformers than for trainees with innate skill. Patient summary: We found that real-time feedback during a training task using a surgical robot improved the performance of trainees when the task was repeated. This feedback approach could help in training doctors in robotic surgery.

6.
NPJ Digit Med ; 5(1): 187, 2022 Dec 22.
Article in English | MEDLINE | ID: mdl-36550203

ABSTRACT

How well a surgery is performed impacts a patient's outcomes; however, objective quantification of performance remains an unsolved challenge. Deconstructing a procedure into discrete instrument-tissue "gestures" is an emerging way to understand surgery. To establish this paradigm in a procedure where performance is the most important factor for patient outcomes, we identify 34,323 individual gestures performed in 80 nerve-sparing robot-assisted radical prostatectomies from two international medical centers. Gestures are classified into nine distinct dissection gestures (e.g., hot cut) and four supporting gestures (e.g., retraction). Our primary outcome is to identify factors impacting a patient's 1-year erectile function (EF) recovery after radical prostatectomy. We find that less use of hot cut and more use of peel/push are statistically associated with a better chance of 1-year EF recovery. Our results also show interactions between surgeon experience and gesture types: similar gesture selection resulted in different EF recovery rates dependent on surgeon experience. To further validate this framework, two teams independently constructed distinct machine learning models using gesture sequences vs. traditional clinical features to predict 1-year EF. In both models, gesture sequences are able to better predict 1-year EF (Team 1: AUC 0.77, 95% CI 0.73-0.81; Team 2: AUC 0.68, 95% CI 0.66-0.70) than traditional clinical features (Team 1: AUC 0.69, 95% CI 0.65-0.73; Team 2: AUC 0.65, 95% CI 0.62-0.68). Our results suggest that gestures provide a granular method to objectively indicate surgical performance and outcomes. Application of this methodology to other surgeries may lead to discoveries on methods to improve surgery.
