ABSTRACT
Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.
Subjects
Artificial Intelligence
ABSTRACT
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint: a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.
Subjects
Algorithms , Image Processing, Computer-Assisted , Machine Learning , Semantics
ABSTRACT
BACKGROUND: Gaming can serve as an educational tool to allow trainees to practice surgical decision-making in a low-stakes environment. LapBot is a novel free interactive mobile game application that uses artificial intelligence (AI) to provide players with feedback on safe dissection during laparoscopic cholecystectomy (LC). This study aims to provide validity evidence for this mobile game. METHODS: Trainees and surgeons participated by downloading and playing LapBot on their smartphone. Players were presented with intraoperative LC scenes and required to locate their preferred location of dissection of the hepatocystic triangle. They received immediate accuracy scores and personalized feedback using an AI algorithm ("GoNoGoNet") that identifies safe/dangerous zones of dissection. Player scores were assessed globally and across training experience using non-parametric ANOVA. Three-month questionnaires were administered to assess the educational value of LapBot. RESULTS: A total of 903 participants from 64 countries played LapBot. As game difficulty increased, average scores (p < 0.0001) and confidence levels (p < 0.0001) decreased significantly. Scores were significantly positively correlated with players' case volume (p = 0.0002) and training level (p = 0.0003). Most agreed that LapBot should be incorporated as an adjunct into training programs (64.1%), as it improved their ability to reflect critically on feedback they receive during LC (47.5%) or while watching others perform LC (57.5%). CONCLUSIONS: Serious games, such as LapBot, can be effective educational tools for deliberate practice and surgical coaching by promoting learner engagement and experiential learning. Our study demonstrates that players' scores were correlated to their level of expertise, and that after playing the game, most players perceived a significant educational value.
Subjects
Artificial Intelligence , Cholecystectomy, Laparoscopic , Clinical Competence , Mobile Applications , Humans , Cholecystectomy, Laparoscopic/education , Male , Female , Internship and Residency/methods , Video Games , Adult , Education, Medical, Graduate/methods
ABSTRACT
INTRODUCTION: Communication is fundamental to effective surgical coaching. This can be challenging for training during image-guided procedures where coaches and trainees need to articulate technical details on a monitor. Telestration devices that annotate on monitors remotely could potentially overcome these limitations and enhance the coaching experience. This study aims to evaluate the value of a novel telestration device in surgical coaching. METHODS: A randomized-controlled trial was designed. All participants watched a video demonstrating the task followed by a baseline performance assessment and randomization into either control group (conventional verbal coaching without telestration) or telestration group (verbal coaching with telestration). Coaching for a simulated laparoscopic small bowel anastomosis on a dry lab model was done by a faculty surgeon. Following the coaching session, participants underwent a post-coaching performance assessment of the same task. Assessments were recorded and rated by blinded reviewers using a modified Global Rating Scale of the Objective Structured Assessment of Technical Skills (OSATS). Coaching sessions were also recorded and compared in terms of mentoring moments, guidance misinterpretations, questions/clarifications by trainees, and task completion time. A 5-point Likert scale was administered to obtain feedback. RESULTS: Twenty-four residents participated (control group 13, telestration group 11). Improvements in some elements of the OSATS scale were noted in the telestration group, but there was no statistically significant difference in the overall score between the two groups. Mentoring moments were more frequent in the telestration group. In the telestration group, 55% felt comfortable that they could perform this task independently, compared to only 8% in the control group, and 82% would recommend the use of telestration tools here.
CONCLUSION: This novel telestration device has demonstrated educational value, mainly in the non-technical aspects of the interaction: it enhances the coaching experience through improved communication and more mentoring moments between coach and trainee.
Subjects
Clinical Competence , Internship and Residency , Mentoring , Humans , Mentoring/methods , Internship and Residency/methods , Male , Female , Laparoscopy/education , Adult , Anastomosis, Surgical/education , Simulation Training/methods , Intestine, Small/surgery
ABSTRACT
BACKGROUND: The learning curve in minimally invasive surgery (MIS) is lengthened compared to open surgery. It has been reported that structured feedback and training in teams of two trainees improves MIS training and MIS performance. Annotation of surgical images and videos may prove beneficial for surgical training. This study investigated whether structured feedback and video debriefing, including annotation of critical view of safety (CVS), have beneficial learning effects in a predefined, multi-modal MIS training curriculum in teams of two trainees. METHODS: This randomized-controlled single-center study included medical students without MIS experience (n = 80). The participants first completed a standardized and structured multi-modal MIS training curriculum. They were then randomly divided into two groups (n = 40 each), and four laparoscopic cholecystectomies (LCs) were performed on ex-vivo porcine livers each. Students in the intervention group received structured feedback after each LC, consisting of LC performance evaluations through tutor-trainee joint video debriefing and CVS video annotation. Performance was evaluated using global and LC-specific Objective Structured Assessments of Technical Skills (OSATS) and Global Operative Assessment of Laparoscopic Skills (GOALS) scores. RESULTS: The participants in the intervention group had higher global and LC-specific OSATS as well as global and LC-specific GOALS scores than the participants in the control group (25.5 ± 7.3 vs. 23.4 ± 5.1, p = 0.003; 47.6 ± 12.9 vs. 36 ± 12.8, p < 0.001; 17.5 ± 4.4 vs. 16 ± 3.8, p < 0.001; 6.6 ± 2.3 vs. 5.9 ± 2.1, p = 0.005). The intervention group achieved CVS more often than the control group (1. LC: 20 vs. 10 participants, p = 0.037, 2. LC: 24 vs. 8, p = 0.001, 3. LC: 31 vs. 8, p < 0.001, 4. LC: 31 vs. 10, p < 0.001). 
CONCLUSIONS: Structured feedback and video debriefing with CVS annotation improve CVS achievement and ex-vivo porcine LC training performance based on OSATS and GOALS scores.
Subjects
Cholecystectomy, Laparoscopic , Clinical Competence , Video Recording , Cholecystectomy, Laparoscopic/education , Humans , Swine , Animals , Female , Male , Learning Curve , Curriculum , Adult , Students, Medical , Formative Feedback , Young Adult , Feedback
ABSTRACT
BACKGROUND: Digital surgery is a new paradigm within the surgical innovation space that is rapidly advancing and encompasses multiple areas. METHODS: This white paper from the SAGES Digital Surgery Working Group outlines the scope of digital surgery, defines key terms, and analyzes the challenges and opportunities surrounding this disruptive technology. RESULTS: In its simplest form, digital surgery inserts a computer interface between surgeon and patient. We divide the digital surgery space into the following elements: advanced visualization, enhanced instrumentation, data capture, data analytics with artificial intelligence/machine learning, connectivity via telepresence, and robotic surgical platforms. We will define each area, describe specific terminology, review current advances as well as discuss limitations and opportunities for future growth. CONCLUSION: Digital Surgery will continue to evolve and has great potential to bring value to all levels of the healthcare system. The surgical community has an essential role in understanding, developing, and guiding this emerging field.
Subjects
Robotic Surgical Procedures , Surgeons , Humans , Artificial Intelligence , Machine Learning , Forecasting
ABSTRACT
BACKGROUND: Adverse events during surgery can occur in part due to errors in visual perception and judgment. Deep learning is a branch of artificial intelligence (AI) that has shown promise in providing real-time intraoperative guidance. This study aims to train and test the performance of a deep learning model that can identify inappropriate landing zones during endovascular aneurysm repair (EVAR). METHODS: A deep learning model was trained to identify a "No-Go" landing zone during EVAR, defined by coverage of the lowest renal artery by the stent graft. Fluoroscopic images from elective EVAR procedures performed at a single institution and from open-access sources were selected. Annotations of the "No-Go" zone were performed by trained annotators. A 10-fold cross-validation technique was used to evaluate the performance of the model against human annotations. Primary outcomes were intersection-over-union (IoU) and F1 score and secondary outcomes were pixel-wise accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). RESULTS: The AI model was trained using 369 images procured from 110 different patients/videos, including 18 patients/videos (44 images) from open-access sources. For the primary outcomes, IoU and F1 were 0.43 (standard deviation ± 0.29) and 0.53 (±0.32), respectively. For the secondary outcomes, accuracy, sensitivity, specificity, NPV, and PPV were 0.97 (±0.002), 0.51 (±0.34), 0.99 (±0.001), 0.99 (±0.002), and 0.62 (±0.34), respectively. CONCLUSIONS: AI can effectively identify suboptimal areas of stent deployment during EVAR. Further directions include validating the model on datasets from other institutions and assessing its ability to predict optimal stent graft placement and clinical outcomes.
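The outcome measures named in this abstract can all be derived from a pixel-wise confusion matrix. The following is a minimal sketch, not the authors' implementation: the function name `pixel_metrics`, the nested-list mask format, and the toy masks are assumptions made purely for illustration.

```python
def pixel_metrics(pred, truth):
    """Compute pixel-wise metrics for two binary masks of identical shape.

    pred and truth are nested lists of 0/1 values (a hypothetical input
    format chosen for this sketch).
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Return None when a metric is undefined (empty denominator).
        return num / den if den else None

    return {
        "iou": ratio(tp, tp + fp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy 4x4 example: the annotated zone covers 4 pixels, the prediction 3.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
metrics = pixel_metrics(pred, truth)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In the toy example the target occupies a small share of the image, so accuracy and specificity stay high even though overlap (IoU, F1) is modest. This is one reason overlap measures are usually read alongside accuracy when the annotated region is small.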
Subjects
Aortic Aneurysm, Abdominal , Blood Vessel Prosthesis Implantation , Endovascular Procedures , Humans , Aortic Aneurysm, Abdominal/diagnostic imaging , Aortic Aneurysm, Abdominal/surgery , Aortic Aneurysm, Abdominal/etiology , Blood Vessel Prosthesis Implantation/adverse effects , Blood Vessel Prosthesis Implantation/methods , Treatment Outcome , Artificial Intelligence , Endovascular Procedures/adverse effects , Endovascular Procedures/methods , Stents , Retrospective Studies , Blood Vessel Prosthesis
ABSTRACT
BACKGROUND: The safe and effective performance of a posterior component separation via a transversus abdominis release (TAR) requires intraoperative judgement and decision-making skills that are difficult to define, standardize, and teach. We herein present the first qualitative study which builds a framework upon which training and objective evaluation of a TAR can be based. METHODS: Hierarchical and cognitive task analyses for a TAR procedure were performed using semistructured interviews of hernia experts to describe the thoughts and behaviors that exemplify optimal performance. Verbal data was recorded, transcribed, coded, and thematically analyzed. RESULTS: A conceptual framework was synthesized based on literary sources (4 book chapters, 4 peer-reviewed articles, 3 online videos), 2 field observations, and interviews of 4 hernia experts [median 66 minutes (44-78)]. Subject matter experts practiced a median of 6.5 years (1.5-16) and have completed a median of 300 (60-500) TARs. After 5 rounds of inductive analysis, 80 subtasks, 86 potential errors, 36 cognitive behaviors, and 17 decision points were identified and categorized into 10 procedural steps (midline laparotomy, adhesiolysis, retrorectus dissection, etc.) and 9 fundamental principles: patient physiology and disease burden; tactical modification; tissue reconstruction and wound healing; task completion; choice of technique and instruments; safe planes and danger zones; exposure, ergonomics, environmental limitations; anticipation and forward planning; and tissue trauma and handling. CONCLUSION: This is the first study to define the key tasks, decisions, and cognitive behaviors that are essential to a successful TAR procedure.
Subjects
Abdominal Wall , Hernia, Ventral , Humans , Abdominal Muscles/surgery , Hernia, Ventral/surgery , Laparotomy , Herniorrhaphy/methods , Surgical Mesh
ABSTRACT
INTRODUCTION: Continuing Professional Development opportunities for lifelong learning are fundamental to the acquisition of surgical expertise. However, few opportunities exist for longitudinal and structured learning to support the educational needs of surgeons in practice. While peer-to-peer coaching has been proposed as a potential solution, there remain significant logistical constraints and a lack of evidence to support its effectiveness. The purpose of this study is to determine whether the use of remote videoconferencing for video-based coaching improves operative performance. METHODS: Early career surgeon mentees participated in a remote coaching intervention with a surgeon coach of their choice and using a virtual telestration platform (Zoom Video Communications, San Jose, CA). Feedback was articulated through annotating videos. The coach evaluated mentee performance using a modified Intraoperative Performance Assessment Tool (IPAT). Participants completed a 5-point Likert scale on the educational value of the coaching program. RESULTS: Eight surgeons were enrolled in the study, six of whom completed a total of two coaching sessions (baseline, 6-month). Subspecialties included endocrine, hepatopancreatobiliary, and surgical oncology. Mean age of participants was 39 (SD 3.3), with mean 5 (SD 4.1) years in independent practice. Total IPAT scores increased significantly from the first session (mean 47.0, SD 1.9) to the second session (mean 51.8, SD 2.1), p = 0.03. Sub-category analysis showed a significant improvement in the Advanced Cognitive Skills domain with a mean of 33.2 (SD 2.5) versus a mean of 37.0 (SD 2.4), p < 0.01. There was no improvement in the psychomotor skills category. Participants agreed or strongly agreed that the coaching programs can improve surgical performance and decision-making (coaches 85%; mentees 100%).
CONCLUSION: Remote surgical coaching is feasible and has educational value using ubiquitous commercially available virtual platforms. Logistical issues with scheduling and finding cases aligned with learning objectives continue to challenge program adoption and widespread dissemination.
Subjects
Mentoring , Surgeons , Humans , Surgeons/education , Learning , Educational Status
ABSTRACT
INTRODUCTION: Bile duct injuries (BDIs) are a significant source of morbidity among patients undergoing laparoscopic cholecystectomy (LC). GoNoGoNet is an artificial intelligence (AI) algorithm that has been developed and validated to identify safe ("Go") and dangerous ("No-Go") zones of dissection during LC, with the potential to prevent BDIs through real-time intraoperative decision-support. This study evaluates GoNoGoNet's ability to predict Go/No-Go zones during LCs with BDIs. METHODS AND PROCEDURES: Eleven LC videos with BDI (BDI group) were annotated by GoNoGoNet. All tool-tissue interactions, including the one that caused the BDI, were characterized in relation to the algorithm's predicted location of Go/No-Go zones. These were compared to another 11 LC videos with cholecystitis (control group) deemed to represent "safe cholecystectomy" by experts. The probability threshold of GoNoGoNet annotations was then modulated to determine its relationship to Go/No-Go predictions. Data are shown as % difference [99% confidence interval]. RESULTS: Compared to control, the BDI group showed a significantly greater proportion of sharp dissection (+23.5% [20.0-27.0]), blunt dissection (+32.1% [27.2-37.0]), and total interactions (+33.6% [31.0-36.2]) outside of the Go zone. Among injury-causing interactions, 4 (36%) were in the No-Go zone, 2 (18%) were in the Go zone, and 5 (45%) were outside both zones, after maximizing the probability threshold of the Go algorithm. CONCLUSION: AI has potential to detect unsafe dissection and prevent BDIs through real-time intraoperative decision-support. More work is needed to determine how to optimize integration of this technology into the operating room workflow and adoption by end-users.
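To illustrate how a probability threshold can change whether an interaction falls inside a predicted zone, here is a rough sketch under stated assumptions. The abstract does not describe GoNoGoNet's output format or thresholding rule; the function `classify_interaction`, the per-pixel probability maps, and the precedence given to No-Go are all hypothetical. Only the counts in the final step (4, 2 and 5 of 11) come from the abstract.

```python
def classify_interaction(go_prob, nogo_prob, point, threshold=0.5):
    """Label one tool-tissue interaction point against thresholded zone maps.

    go_prob / nogo_prob: nested lists of per-pixel probabilities
    (hypothetical model outputs); point: (row, col); threshold: cut-off
    above which a pixel counts as belonging to a zone.
    """
    row, col = point
    in_go = go_prob[row][col] >= threshold
    in_nogo = nogo_prob[row][col] >= threshold
    if in_nogo:
        return "No-Go"  # No-Go takes precedence here (an assumption)
    if in_go:
        return "Go"
    return "outside both"


# Hypothetical 2x3 probability maps.
go_prob = [[0.9, 0.6, 0.2], [0.7, 0.4, 0.1]]
nogo_prob = [[0.05, 0.1, 0.35], [0.1, 0.2, 0.8]]

# Raising the threshold shrinks the zones, so the same point can move
# from "Go" to "outside both".
for threshold in (0.5, 0.7):
    print(threshold, classify_interaction(go_prob, nogo_prob, (0, 1), threshold))

# Percentages for the injury-causing interactions reported in the abstract.
counts = {"No-Go": 4, "Go": 2, "outside both": 5}
total = sum(counts.values())
percentages = {label: round(100 * n / total) for label, n in counts.items()}
print(percentages)
```

The last step reproduces the rounded shares given in the results (36%, 18% and 45% of 11 interactions).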
Subjects
Bile Duct Diseases , Cholecystectomy, Laparoscopic , Humans , Cholecystectomy, Laparoscopic/methods , Bile Ducts/injuries , Artificial Intelligence , Cholecystectomy/methods , Bile Duct Diseases/surgery , Risk-Taking
ABSTRACT
BACKGROUND: Surgical video recording provides the opportunity to acquire intraoperative data that can subsequently be used for a variety of quality improvement, research, and educational applications. Various recording devices are available for standard operating room camera systems. Some allow for collateral data acquisition including activities of the OR staff, kinematic measurements (motion of surgical instruments), and recording of the endoscopic video streams. Additional analysis through computer vision (CV), which allows software to understand and perform predictive tasks on images, can allow for automatic phase segmentation, instrument tracking, and derivative performance-geared metrics. With this survey, we summarize available surgical video acquisition technologies and associated performance analysis platforms. METHODS: In an effort promoted by the SAGES Artificial Intelligence Task Force, we surveyed the available video recording technology companies. Of thirteen companies approached, nine were interviewed, each over an hour-long video conference. A standard set of 17 questions was administered. Questions covered data acquisition capacity, quality, synchronization of video with other data, availability of analytic tools, privacy, and access. RESULTS: Most platforms (89%) store video in full-HD (1080p) resolution at a frame rate of 30 fps. Most (67%) of available platforms store data in a Cloud-based databank as opposed to institutional hard drives. CV-powered analysis is featured in some platforms: phase segmentation in 44% of platforms, out-of-body blurring or tool tracking in 33%, and suture time in 11%. Kinematic data are provided by 22% of platforms, and perfusion imaging by one device. CONCLUSION: Video acquisition platforms on the market allow for in-depth performance analysis through manual and automated review. Most of these devices will be integrated in upcoming robotic surgical platforms.
Platform analytic supplementation, including CV, may allow for more refined performance analysis for surgeons and trainees. Most current AI features are related to phase segmentation, instrument tracking, and video blurring.
Subjects
Artificial Intelligence , Robotic Surgical Procedures , Humans , Endoscopy , Software , Privacy , Video Recording
ABSTRACT
BACKGROUND: Many surgical adverse events, such as bile duct injuries during laparoscopic cholecystectomy (LC), occur due to errors in visual perception and judgment. Artificial intelligence (AI) can potentially improve the quality and safety of surgery, such as through real-time intraoperative decision support. GoNoGoNet is a novel AI model capable of identifying safe ("Go") and dangerous ("No-Go") zones of dissection on surgical videos of LC. Yet, it is unknown how GoNoGoNet performs in comparison to expert surgeons. This study aims to evaluate GoNoGoNet's ability to identify Go and No-Go zones compared to an external panel of expert surgeons. METHODS: A panel of high-volume surgeons from the SAGES Safe Cholecystectomy Task Force was recruited to draw free-hand annotations on frames of prospectively collected videos of LC to identify the Go and No-Go zones. Expert consensus on the location of Go and No-Go zones was established using Visual Concordance Test pixel agreement. Identification of Go and No-Go zones by GoNoGoNet was compared to expert-derived consensus using mean F1 Dice score, and pixel accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). RESULTS: A total of 47 frames from 25 LC videos, procured from 3 countries and 9 surgeons, were annotated simultaneously by an expert panel of 6 surgeons and GoNoGoNet. Mean (± standard deviation) F1 Dice scores were 0.58 (0.22) and 0.80 (0.12) for Go and No-Go zones, respectively. Mean (± standard deviation) accuracy, sensitivity, specificity, PPV and NPV for the Go zones were 0.92 (0.05), 0.52 (0.24), 0.97 (0.03), 0.70 (0.21), and 0.94 (0.04), respectively. For No-Go zones, these metrics were 0.92 (0.05), 0.80 (0.17), 0.95 (0.04), 0.84 (0.13) and 0.95 (0.05), respectively. CONCLUSIONS: AI can be used to identify safe and dangerous zones of dissection within the surgical field, with high specificity/PPV for Go zones and high sensitivity/NPV for No-Go zones.
Overall, model prediction was better for No-Go zones compared to Go zones. This technology may eventually be used to provide real-time guidance and minimize the risk of adverse events.
Subjects
Cholecystectomy, Laparoscopic , Surgeons , Humans , Cholecystectomy, Laparoscopic/adverse effects , Artificial Intelligence , Data Collection , Cholecystectomy
ABSTRACT
INTRODUCTION: Surgical complications often occur due to lapses in judgment and decision-making. Advances in artificial intelligence (AI) have made it possible to train algorithms that identify anatomy and interpret the surgical field. These algorithms can potentially be used for intraoperative decision-support and postoperative video analysis and feedback. Despite the very early success of proof-of-concept algorithms, it remains unknown whether this innovation meets the needs of end-users or how best to deploy it. This study explores users' opinion on the value, usability and design for adapting AI in operating rooms. METHODS: A device-agnostic web-accessible software was developed to provide AI inference either (1) intraoperatively on a live video stream (synchronous mode), or (2) on an uploaded video or image file (asynchronous mode) postoperatively for feedback. A validated AI model (GoNoGoNet), which identifies safe and dangerous zones of dissection during laparoscopic cholecystectomy, was used as the use case. Surgeons and trainees performing laparoscopic cholecystectomy interacted with the AI platform and completed a 5-point Likert scale survey to evaluate the educational value, usability and design of the platform. RESULTS: Twenty participants (11 surgeons and 9 trainees) evaluated the platform intraoperatively (n = 10) and postoperatively (n = 11). The majority agreed or strongly agreed that AI is an effective adjunct to surgical training (81%; neutral = 10%), effective for providing real-time feedback (70%; neutral = 20%), postoperative feedback (73%; neutral = 27%), and capable of improving surgeon confidence (67%; neutral = 29%). Only 40% (neutral = 50%) and 57% (neutral = 43%) believe that the tool is effective in improving intraoperative decisions and performance, or beneficial for patient care, respectively. Overall, 38% (neutral = 43%) reported they would use this platform consistently if available. 
The majority agreed or strongly agreed that the platform was easy to use (81%; neutral = 14%) and has acceptable resolution (62%; neutral = 24%), while 30% (neutral = 20%) reported that it disrupted the OR workflow, and 20% (neutral = 0%) reported significant time lag. All respondents reported that such a system should be available "on-demand" to turn on/off at their discretion. CONCLUSIONS: Most found AI to be a useful tool for providing support and feedback to surgeons, despite several implementation obstacles. The study findings will inform the future design and usability of this technology in order to optimize its clinical impact and adoption by end-users.
Subjects
Artificial Intelligence , Surgeons , Humans , Educational Status , Algorithms , Software
ABSTRACT
BACKGROUND: Despite the advantages of laparoscopic cholecystectomy, major bile duct injury (BDI) rates during this operation remain unacceptably high. In October 2018, SAGES released the Safe Cholecystectomy modules, which define specific strategies to minimize the risk of BDI. This study aims to investigate whether this curriculum can change the knowledge and behaviors of surgeons in practice. METHODS: Practicing surgeons were recruited from the membership of SAGES and the American College of Surgeons Advisory Council for Rural Surgery. All participants completed a baseline assessment (pre-test) that involved interpreting cholangiograms, troubleshooting difficult cases, and managing BDI. Participants' dissection strategies during cholecystectomy were also compared to the strategies of a panel of 15 experts based on accuracy scores using the Think Like a Surgeon validated web-based platform. Participants were then randomized to complete the Safe Cholecystectomy modules (Safe Chole module group) or participate in usually scheduled CME activities (control group). Both groups completed repeat assessments (post-tests) one month after randomization. RESULTS: Overall, 41 participants were eligible for analysis, including 18 Safe Chole module participants and 23 controls. The two groups had no significant differences in pre-test scores. However, at post-test, Safe Chole module participants made significantly fewer errors managing BDI and interpreting cholangiograms. Safe Chole module participants were less likely to convert to an open operation on the post-test than controls when facing challenging dissections. However, Safe Chole module participants displayed a similar incidence of errors when evaluating adequate critical views of safety. CONCLUSIONS: In this randomized-controlled trial, the SAGES Safe Cholecystectomy modules improved surgeons' abilities to interpret cholangiograms and safely manage BDI. 
Additionally, surgeons who studied the modules were less likely to convert to open during difficult dissections. These data show the power of the Safe Cholecystectomy modules to affect practicing surgeons' behaviors in a measurable and meaningful way.
Subjects
Abdominal Injuries , Bile Duct Diseases , Cholecystectomy, Laparoscopic , Surgeons , Humans , Bile Ducts/injuries , Judgment , Intraoperative Complications/epidemiology , Cholecystectomy , Cholecystectomy, Laparoscopic/methods
ABSTRACT
BACKGROUND: Surgery generates a vast amount of data from each procedure. Particularly video data provides significant value for surgical research, clinical outcome assessment, quality control, and education. The data lifecycle is influenced by various factors, including data structure, acquisition, storage, and sharing; data use and exploration, and finally data governance, which encompasses all ethical and legal regulations associated with the data. There is a universal need among stakeholders in surgical data science to establish standardized frameworks that address all aspects of this lifecycle to ensure data quality and purpose. METHODS: Working groups were formed, among 48 representatives from academia and industry, including clinicians, computer scientists and industry representatives. These working groups focused on: Data Use, Data Structure, Data Exploration, and Data Governance. After working group and panel discussions, a modified Delphi process was conducted. RESULTS: The resulting Delphi consensus provides conceptualized and structured recommendations for each domain related to surgical video data. We identified the key stakeholders within the data lifecycle and formulated comprehensive, easily understandable, and widely applicable guidelines for data utilization. Standardization of data structure should encompass format and quality, data sources, documentation, metadata, and account for biases within the data. To foster scientific data exploration, datasets should reflect diversity and remain adaptable to future applications. Data governance must be transparent to all stakeholders, addressing legal and ethical considerations surrounding the data. CONCLUSION: This consensus presents essential recommendations around the generation of standardized and diverse surgical video databanks, accounting for multiple stakeholders involved in data generation and use throughout its lifecycle. 
Following the SAGES annotation framework, we lay the foundation for standardization of data use, structure, and exploration. A detailed exploration of requirements for adequate data governance will follow.
Subjects
Artificial Intelligence , Quality Improvement , Humans , Consensus , Data Collection
ABSTRACT
OBJECTIVE: The aim of this study was to develop and evaluate the performance of artificial intelligence (AI) models that can identify safe and dangerous zones of dissection, and anatomical landmarks during laparoscopic cholecystectomy (LC). SUMMARY BACKGROUND DATA: Many adverse events during surgery occur due to errors in visual perception and judgment leading to misinterpretation of anatomy. Deep learning, a subfield of AI, can potentially be used to provide real-time guidance intraoperatively. METHODS: Deep learning models were developed and trained to identify safe (Go) and dangerous (No-Go) zones of dissection, liver, gallbladder, and hepatocystic triangle during LC. Annotations were performed by 4 high-volume surgeons. AI predictions were evaluated using 10-fold cross-validation against annotations by expert surgeons. Primary outcomes were intersection-over-union (IOU) and F1 score (validated spatial correlation indices), and secondary outcomes were pixel-wise accuracy, sensitivity, and specificity, each reported ± standard deviation. RESULTS: AI models were trained on 2627 random frames from 290 LC videos, procured from 37 countries, 136 institutions, and 153 surgeons. Mean IOU, F1 score, accuracy, sensitivity, and specificity for the AI to identify Go zones were 0.53 (±0.24), 0.70 (±0.28), 0.94 (±0.05), 0.69 (±0.20), and 0.94 (±0.03), respectively. For No-Go zones, these metrics were 0.71 (±0.29), 0.83 (±0.31), 0.95 (±0.06), 0.80 (±0.21), and 0.98 (±0.05), respectively. Mean IOU for identification of the liver, gallbladder, and hepatocystic triangle were: 0.86 (±0.12), 0.72 (±0.19), and 0.65 (±0.22), respectively. CONCLUSIONS: AI can be used to identify anatomy within the surgical field. This technology may eventually be used to provide real-time guidance and minimize the risk of adverse events.
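The 10-fold cross-validation mentioned in the methods can be sketched as a simple index split. This is an illustration only: the helper `k_fold_indices`, the shuffling seed, and the frame-level split are assumptions, since the abstract does not state how the folds were formed.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Striding spreads the remainder so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# 2627 frames, as reported in the abstract; each frame is tested exactly once.
splits = list(k_fold_indices(2627, k=10))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

When several frames come from the same video, a frame-level split like this one can place near-identical frames in both the training and test folds. Splitting by video or by patient is a common way to avoid that; whether the study did so is not stated in the abstract.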
Subjects
Laparoscopic Cholecystectomy, Surgeons, Artificial Intelligence, Laparoscopic Cholecystectomy/adverse effects, Gallbladder/surgery, Humans, Semantics
ABSTRACT
BACKGROUND: Transanal total mesorectal excision (TATME) is difficult to learn and can result in serious complications. Current paradigms for assessing performance and competency may be insufficient. This study aims to develop and provide preliminary validity evidence for a TATME virtual assessment tool (TATME-VAT) to assess the cognitive skills necessary to safely complete TATME dissection. METHODS: Participants from North America, Europe, Japan and China completed the test via an interactive online platform between 11/2019 and 05/2020. They were grouped into expert, experienced and novice surgeons depending on the number of independently performed TATMEs. TATME-VAT is a 24-item web-based assessment evaluating advanced cognitive skills, designed according to a blueprint from consensus guidelines. Eight items were multiple-choice questions. Sixteen items required making annotations on still frames of TATME videos (VCT) and were scored using a validated algorithm derived from experts' responses. Annotation (range 0-100), multiple-choice (range 0-100), and overall scores (sum of annotation and multiple-choice scores, normalized to µ = 50 and σ = 10) were reported. RESULTS: There were significant differences between the expert, experienced, and novice groups for the annotation (p < 0.001), multiple-choice (p < 0.001), and overall scores (p < 0.001). The annotation (p = 0.439) and overall (p = 0.152) scores were similar between the experienced and novice groups. Annotation scores were higher in participants with 51 or more vs. 30-50 vs. fewer than 30 cases. Scores were also lower in users with a self-reported recent complication vs. those without. CONCLUSIONS: This study describes the development of an interactive video-based virtual assessment tool for TATME dissection and provides initial validity evidence for its use.
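The overall score described above is a sum of two component scores rescaled to a mean of 50 and a standard deviation of 10. One common way to do this is a linear transform of z-scores, sketched below. The abstract does not say whether a population or sample standard deviation was used, so the population form is an assumption here; the function name and the three participants' scores are hypothetical.

```python
from statistics import mean, pstdev


def normalize_scores(raw, target_mean=50.0, target_sd=10.0):
    """Linearly rescale raw scores to the given mean and standard deviation.

    Assumption: the population standard deviation (pstdev) is used.
    """
    mu = mean(raw)
    sigma = pstdev(raw)
    if sigma == 0:
        raise ValueError("all scores are identical; cannot rescale")
    return [target_mean + target_sd * (x - mu) / sigma for x in raw]


# Hypothetical component scores, each on a 0-100 scale.
annotation = [80, 60, 40]
multiple_choice = [90, 70, 50]

# Overall raw score = annotation + multiple-choice, then normalized.
raw_overall = [a + m for a, m in zip(annotation, multiple_choice)]
overall = normalize_scores(raw_overall)
print([round(s, 2) for s in overall])
```

Because the transform is linear, it preserves the ranking of participants and the relative spacing of their scores; only the scale changes.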
Subjects
Laparoscopy, Proctectomy, Rectal Neoplasms, Surgeons, Transanal Endoscopic Surgery, Europe, Humans, Laparoscopy/adverse effects, Postoperative Complications/etiology, Proctectomy/adverse effects, Rectal Neoplasms/complications, Rectal Neoplasms/surgery, Rectum/surgery, Transanal Endoscopic Surgery/adverse effects
ABSTRACT
Surgical data science (SDS) aims to improve the quality of interventional healthcare and its value through the capture, organization, analysis, and modeling of procedural data. As data capture has increased and artificial intelligence (AI) has advanced, SDS can help to unlock augmented and automated coaching, feedback, assessment, and decision support in surgery. We review major concepts in SDS and AI as applied to surgical education and surgical oncology.
Subjects
Artificial Intelligence, Data Science, Graduate Medical Education/methods, Surgical Oncology/education, Clinical Competence, Clinical Decision Support Systems, Europe, Humans, North America, Operative Surgical Procedures/education, Operative Surgical Procedures/methods
ABSTRACT
INTRODUCTION: Hospital readmissions constitute an important component of the costs associated with a disease and can place a significant burden on healthcare. The majority of studies evaluating readmissions following laparoscopic cholecystectomy (LC) are single-center studies and thus can underestimate the actual incidence of readmission. We sought to examine the rate and causes of readmissions following LC using a large longitudinal database. METHODS: The New York SPARCS database was used to identify all adult patients undergoing laparoscopic cholecystectomy for benign biliary disease between 2000 and 2016. Due to the presence of a unique identifier, patients with readmission to any New York hospital were evaluated. Planned versus unplanned readmission rates were compared. Following univariate analysis, a multivariable logistic regression model was used to identify risk factors for unplanned readmissions after accounting for baseline characteristics, comorbidities and complications. RESULTS: There were 591,627 patients who underwent LC during the studied time period. The overall 30-day readmission rate was 4.94% (n = 29,245) and the unplanned 30-day readmission rate was 4.58% (n = 27,084). Female patients were less likely to have 30-day unplanned readmissions. Patients older than 65 or younger than 29 were more likely to have 30-day unplanned readmissions compared to patients aged 30-44 or 45-64. Insurance status was also significant, as patients with Medicaid/Medicare were more likely to have unplanned readmissions compared to those with commercial insurance. In addition, variables such as Black race, presence of any comorbidity, postoperative complication, and prolonged initial hospital length of stay were associated with subsequent readmission. CONCLUSION: These data show that readmission rates following LC are relatively low; however, the majority of readmissions are unplanned. The most common reason for unplanned readmission was a complication of the procedure or medical care. By identifying certain risk groups, unplanned readmissions may be prevented.
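The reported rates follow directly from the counts given in the results. As a quick arithmetic check, the sketch below recomputes them from the three figures in the abstract; the variable and function names are illustrative only.

```python
# Counts taken from the abstract's RESULTS section.
total_lc = 591_627        # patients who underwent LC
readmit_30d = 29_245      # all 30-day readmissions
unplanned_30d = 27_084    # unplanned 30-day readmissions


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


overall_rate = pct(readmit_30d, total_lc)          # 4.94
unplanned_rate = pct(unplanned_30d, total_lc)      # 4.58
unplanned_share = pct(unplanned_30d, readmit_30d)  # 92.61

print(overall_rate, unplanned_rate, unplanned_share)
```

The last figure shows that roughly 93% of 30-day readmissions were unplanned, which is the basis for the statement that the majority of readmissions are unplanned.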
Subjects
Laparoscopic Cholecystectomy, Patient Readmission, Adult, Aged, Laparoscopic Cholecystectomy/adverse effects, Female, Humans, Medicare, New York/epidemiology, Postoperative Complications/epidemiology, Postoperative Complications/etiology, Postoperative Complications/surgery, Retrospective Studies, Risk Factors, United States
ABSTRACT
BACKGROUND: The growing interest in analysis of surgical video through machine learning has led to increased research efforts; however, common methods of annotating video data are lacking. There is a need to establish recommendations on the annotation of surgical video data to enable assessment of algorithms and multi-institutional collaboration. METHODS: Four working groups were formed from a pool of participants that included clinicians, engineers, and data scientists. The working groups were focused on four themes: (1) temporal models, (2) actions and tasks, (3) tissue characteristics and general anatomy, and (4) software and data structure. A modified Delphi process was utilized to create a consensus survey based on suggested recommendations from each of the working groups. RESULTS: After three Delphi rounds, consensus was reached on recommendations for annotation within each of these domains. A hierarchy for annotation of temporal events in surgery was established. CONCLUSIONS: While additional work remains to achieve accepted standards for video annotation in surgery, the consensus recommendations on a general framework for annotation presented here lay the foundation for standardization. This type of framework is critical to enabling diverse datasets, performance benchmarks, and collaboration.