Results 1 - 8 of 8
1.
Surg Endosc; 37(11): 8690-8707, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37516693

ABSTRACT

BACKGROUND: Surgery generates a vast amount of data from each procedure. Video data in particular provides significant value for surgical research, clinical outcome assessment, quality control, and education. The data lifecycle is influenced by various factors, including data structure; acquisition, storage, and sharing; data use and exploration; and finally data governance, which encompasses all ethical and legal regulations associated with the data. There is a universal need among stakeholders in surgical data science for standardized frameworks that address all aspects of this lifecycle to ensure data quality and fitness for purpose.

METHODS: Working groups were formed among 48 representatives from academia and industry, including clinicians, computer scientists, and industry representatives. These working groups focused on four domains: Data Use, Data Structure, Data Exploration, and Data Governance. After working-group and panel discussions, a modified Delphi process was conducted.

RESULTS: The resulting Delphi consensus provides conceptualized and structured recommendations for each domain related to surgical video data. We identified the key stakeholders within the data lifecycle and formulated comprehensive, easily understandable, and widely applicable guidelines for data use. Standardization of data structure should encompass format and quality, data sources, documentation, and metadata, and should account for biases within the data. To foster scientific data exploration, datasets should reflect diversity and remain adaptable to future applications. Data governance must be transparent to all stakeholders, addressing the legal and ethical considerations surrounding the data.

CONCLUSION: This consensus presents essential recommendations for the generation of standardized and diverse surgical video databanks, accounting for the multiple stakeholders involved in data generation and use throughout the data lifecycle. Following the SAGES annotation framework, we lay the foundation for standardization of data use, structure, and exploration. A detailed exploration of the requirements for adequate data governance will follow.
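To make the data-structure and metadata recommendations concrete, below is a minimal sketch of what a standardized metadata record for a surgical video might look like. All field names are hypothetical illustrations of the themes in the consensus (format and quality, data sources, documentation, governance, bias), not part of the published framework.

```python
from dataclasses import dataclass, field

# Hypothetical metadata record for a surgical video. Field names are
# illustrative of the consensus themes, not the published schema.
@dataclass
class SurgicalVideoMetadata:
    video_id: str                  # de-identified identifier
    procedure: str                 # e.g., "laparoscopic cholecystectomy"
    institution: str               # data source / acquiring center
    resolution: tuple[int, int]    # frame size, e.g., (1920, 1080)
    fps: float                     # frames per second
    annotation_protocol: str       # reference to the labeling guideline used
    annotator_roles: list[str] = field(default_factory=list)   # e.g., ["attending", "resident"]
    consent_scope: str = "research"    # governance: permitted uses of the data
    known_biases: list[str] = field(default_factory=list)      # e.g., ["single-center", "elective cases only"]
```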


Subjects
Artificial Intelligence; Quality Improvement; Humans; Consensus; Data Collection
2.
Surg Endosc; 36(9): 6832-6840, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35031869

ABSTRACT

BACKGROUND: The operative course of laparoscopic cholecystectomy varies widely with the underlying pathology. Efforts to assess intra-operative difficulty include the Parkland grading scale (PGS), which scores inflammation from the initial view of the gallbladder on a 1-5 scale. We investigated the impact of PGS on intra-operative outcomes, including laparoscopic duration, attainment of the critical view of safety (CVS), and gallbladder injury. We additionally trained an artificial intelligence (AI) model to identify the PGS.

METHODS: One surgeon labeled surgical phases, PGS, CVS attainment, and gallbladder injury in 200 cholecystectomy videos. We used multilevel Bayesian regression models to analyze the effect of PGS on intra-operative outcomes. We trained AI models to identify the PGS from an initial view of the gallbladder and compared model performance to annotations by a second surgeon.

RESULTS: Slightly inflamed gallbladders (PGS-2) minimally increased duration, adding 2.7 [95% compatibility interval (CI) 0.3-7.0] minutes to an operation. By contrast, maximally inflamed gallbladders (PGS-5) added 16.9 (95% CI 4.4-33.9) minutes on average, and 31.3 (95% CI 8.0-67.5) minutes for the most affected surgeon. Inadvertent gallbladder injury occurred in 25% of cases, with a minimal increase in injury observed with added inflammation; however, for some surgeons, the probability of a gallbladder hole increased by up to 28% (95% CI -2 to 63) in PGS-5 cases. Inflammation had no substantial effect on whether a surgeon attained the CVS. An AI model could reliably (Krippendorff's α = 0.71, 95% CI 0.65-0.77) quantify inflammation when compared to a second surgeon (α = 0.82, 95% CI 0.75-0.87).

CONCLUSIONS: An AI model can identify the degree of gallbladder inflammation, which is predictive of the intra-operative course of cholecystectomy. This automated assessment could be useful for operating-room workflow optimization and for targeted per-surgeon and per-resident feedback to accelerate the acquisition of operative skills.
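The inter-rater reliability statistic reported above, Krippendorff's α, can be computed for ordinal Parkland grades with the open-source `krippendorff` Python package. The sketch below uses made-up placeholder grades for the AI model and a surgeon; only the statistic, not the data, mirrors the study.

```python
import numpy as np
import krippendorff  # pip install krippendorff

# Agreement between an AI model and a surgeon on ordinal Parkland grades
# (1-5). The grades below are placeholders; np.nan marks a video one
# rater did not grade.
model_grades   = [2, 3, 5, 1, 4, 2, np.nan, 3]
surgeon_grades = [2, 3, 4, 1, 4, 2, 5, 3]

alpha = krippendorff.alpha(
    reliability_data=np.array([model_grades, surgeon_grades]),  # raters x units
    level_of_measurement="ordinal",
)
print(f"Krippendorff's alpha (ordinal): {alpha:.2f}")
```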


Subjects
Cholecystectomy, Laparoscopic; Cholecystitis; Gallbladder Diseases; Artificial Intelligence; Bayes Theorem; Cholecystectomy; Cholecystectomy, Laparoscopic/adverse effects; Cholecystitis/surgery; Gallbladder/pathology; Gallbladder/surgery; Gallbladder Diseases/pathology; Gallbladder Diseases/surgery; Humans; Inflammation/etiology; Inflammation/pathology
3.
Surg Endosc; 35(7): 4008-4015, 2021 Jul.
Article in English | MEDLINE | ID: mdl-32720177

ABSTRACT

BACKGROUND: Artificial intelligence (AI) and computer vision (CV) have revolutionized image analysis. In surgery, CV applications have focused on identifying surgical phases in laparoscopic videos. We proposed to apply CV techniques to identify phases in an endoscopic procedure, peroral endoscopic myotomy (POEM).

METHODS: POEM videos were collected from Massachusetts General Hospital and Showa University Koto Toyosu Hospital. Videos were labeled by surgeons with the following ground-truth phases: (1) Submucosal injection, (2) Mucosotomy, (3) Submucosal tunnel, (4) Myotomy, and (5) Mucosotomy closure. The deep-learning CV model, a convolutional neural network (CNN) combined with a long short-term memory (LSTM) network, was trained on 30 videos to create POEMNet. We then used POEMNet to identify operative phases in the remaining 20 videos and compared its performance to the surgeon-annotated ground truth.

RESULTS: POEMNet's overall phase-identification accuracy was 87.6% (95% CI 87.4-87.9%). When evaluated on a per-phase basis, the model performed well, with mean unweighted and prevalence-weighted F1 scores of 0.766 and 0.875, respectively. The model performed best on longer phases: accuracy was 70.6% for phases under 5 min in duration and 88.3% for longer phases.

DISCUSSION: A deep-learning approach to CV, previously successful in laparoscopic video phase identification, translates well to endoscopic procedures. With continued refinement, AI could contribute to intra-operative decision-support systems and post-operative risk prediction.
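A minimal PyTorch sketch of the CNN-plus-LSTM design described above: a per-frame CNN encoder feeds an LSTM that models temporal context, and a linear head emits per-frame phase logits. The backbone, hidden size, and input shapes are illustrative assumptions, not the published POEMNet configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Sketch of a CNN+LSTM surgical-phase classifier in the spirit of POEMNet.
class PhaseClassifier(nn.Module):
    def __init__(self, num_phases: int = 5, hidden_size: int = 256):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()            # keep 512-d frame features
        self.encoder = backbone
        self.lstm = nn.LSTM(512, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_phases)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, time, 3, H, W) -> per-frame phase logits
        b, t = clips.shape[:2]
        feats = self.encoder(clips.flatten(0, 1))   # (b*t, 512)
        feats = feats.view(b, t, -1)
        temporal, _ = self.lstm(feats)              # (b, t, hidden)
        return self.head(temporal)                  # (b, t, num_phases)

logits = PhaseClassifier()(torch.randn(2, 16, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 16, 5])
```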


Subjects
Esophageal Achalasia; Laparoscopy; Myotomy; Natural Orifice Endoscopic Surgery; Artificial Intelligence; Esophageal Achalasia/surgery; Humans; Neural Networks, Computer
4.
IEEE Trans Med Imaging; 43(1): 264-274, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37498757

ABSTRACT

Analysis of relations between objects and comprehension of abstract concepts in surgical video are important in AI-augmented surgery. However, building models that integrate our knowledge and understanding of surgery remains a challenging endeavor. In this paper, we propose a novel way to integrate conceptual knowledge into temporal analysis tasks using temporal concept graph networks. In the proposed networks, a knowledge graph is incorporated into the temporal video analysis of surgical notions, learning the meaning of concepts and relations as they apply to the data. We demonstrate results on surgical video data for tasks such as verification of the critical view of safety, estimation of the Parkland grading scale, and recognition of instrument-action-tissue triplets. The results show that our method improves recognition and detection on complex benchmarks and enables other analytic applications of interest.
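As a rough illustration of the general idea (not the architecture proposed in the paper), the toy PyTorch module below propagates concept embeddings over a knowledge-graph adjacency matrix and fuses them with per-frame video features to produce per-frame concept logits. All dimensions and the adjacency matrix are placeholders.

```python
import torch
import torch.nn as nn

# Toy sketch: one round of message passing over concept relations, then
# attention-based fusion with per-frame video features.
class ConceptGraphFusion(nn.Module):
    def __init__(self, num_concepts: int, dim: int, adjacency: torch.Tensor):
        super().__init__()
        self.concepts = nn.Embedding(num_concepts, dim)
        self.register_buffer("adj", adjacency)       # (C, C), row-normalized
        self.graph_proj = nn.Linear(dim, dim)
        self.score = nn.Linear(dim, num_concepts)

    def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
        # frame_feats: (batch, time, dim)
        c = torch.relu(self.graph_proj(self.adj @ self.concepts.weight))  # (C, dim)
        attn = torch.softmax(frame_feats @ c.T, dim=-1)                   # (b, t, C)
        fused = frame_feats + attn @ c                                    # (b, t, dim)
        return self.score(fused)                      # per-frame concept logits

adj = torch.eye(10)  # placeholder knowledge graph with self-loops only
model = ConceptGraphFusion(num_concepts=10, dim=64, adjacency=adj)
print(model(torch.randn(2, 8, 64)).shape)  # torch.Size([2, 8, 10])
```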


Subjects
Neural Networks, Computer; Surgical Procedures, Operative; Video Recording
5.
IEEE Trans Pattern Anal Mach Intell; 45(6): 7820-7835, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36441894

ABSTRACT

Transformers have shown superior performance on a wide variety of tasks since their introduction. In recent years, they have drawn attention from the vision community for tasks such as image classification and object detection. Despite this wave, an accurate and efficient multiple-object tracking (MOT) method based on transformers has yet to be designed. We argue that the direct application of a transformer architecture, with its quadratic complexity and insufficient, noise-initialized sparse queries, is not optimal for MOT. We propose TransCenter, a transformer-based MOT architecture with dense representations for accurately tracking all objects while keeping a reasonable runtime. Methodologically, we propose image-related dense detection queries and efficient sparse tracking queries produced by our carefully designed query learning networks (QLN). On one hand, the dense image-related detection queries allow us to infer targets' locations globally and robustly through dense heatmap outputs. On the other hand, the set of sparse tracking queries efficiently interacts with image features in our TransCenter decoder to associate object positions through time. As a result, TransCenter exhibits remarkable performance improvements and outperforms current state-of-the-art methods by a large margin on two standard MOT benchmarks with two tracking settings (public/private). TransCenter is also shown to be efficient and accurate through an extensive ablation study and comparisons to more naive alternatives and concurrent works. The code is publicly available at https://github.com/yihongxu/transcenter.
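The dense heatmap outputs mentioned above are typically decoded into object centers with a max-pooling non-maximum-suppression step followed by top-k selection. The sketch below shows this standard center-point decoding step; it is illustrative and not code from the TransCenter repository.

```python
import torch
import torch.nn.functional as F

# Decode object centers from a dense heatmap: keep local maxima via max-pool
# NMS, then take the top-k peaks as candidate centers.
def decode_centers(heatmap: torch.Tensor, k: int = 10):
    """heatmap: (1, H, W) with values in [0, 1]; returns top-k peak coords."""
    pooled = F.max_pool2d(heatmap.unsqueeze(0), 3, stride=1, padding=1).squeeze(0)
    peaks = heatmap * (pooled == heatmap)       # suppress non-maxima
    scores, idx = torch.topk(peaks.flatten(), k)
    ys, xs = idx // heatmap.shape[-1], idx % heatmap.shape[-1]
    return torch.stack([xs, ys], dim=1), scores  # (k, 2) centers and scores

centers, scores = decode_centers(torch.rand(1, 152, 272))
print(centers.shape, scores.shape)  # torch.Size([10, 2]) torch.Size([10])
```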

6.
IEEE Trans Pattern Anal Mach Intell; 43(5): 1761-1776, 2021 May.
Article in English | MEDLINE | ID: mdl-31751223

ABSTRACT

In this article, we address the problem of tracking multiple speakers via the fusion of visual and auditory information. We propose to exploit the complementary nature and roles of these two modalities in order to accurately estimate smooth trajectories of the tracked persons, to deal with the partial or total absence of one modality over short periods of time, and to estimate the acoustic status (either speaking or silent) of each tracked person over time. We cast the problem into a generative audio-visual fusion (or association) model formulated as a latent-variable temporal graphical model. This can be viewed as the problem of maximizing the posterior joint distribution of a set of continuous and discrete latent variables given the past and current observations, which is intractable. We therefore propose a variational inference model that approximates the joint distribution with a factorized distribution; the solution takes the form of a closed-form expectation-maximization procedure. We describe the inference algorithm in detail, evaluate its performance, and compare it with several baseline methods. These experiments show that the proposed audio-visual tracker performs well in informal meetings involving a time-varying number of people.
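As a toy illustration of the alternating factorized updates behind such a variational EM procedure (greatly simplified from the paper's temporal graphical model), the sketch below fuses a visual and an audio observation of a speaker's 1-D position while inferring a Bernoulli posterior over speaking status. All parameters and update forms are illustrative assumptions, not the published model.

```python
import numpy as np

# Alternate two factor updates: a Gaussian posterior over 1-D position and a
# Bernoulli posterior over speaking status. The audio observation contributes
# to the position estimate only in proportion to q(speaking).
def fuse_step(x_vis, x_aud, audio_energy,
              var_vis=0.05, var_aud=0.5, energy_thresh=1.0):
    q_speak = 0.5                        # initial q(speaking)
    for _ in range(10):                  # alternate the factorized updates
        # Update q(position): precision-weighted fusion of the modalities.
        prec = 1.0 / var_vis + q_speak / var_aud
        mean = (x_vis / var_vis + q_speak * x_aud / var_aud) / prec
        # Update q(speaking) from audio energy and audio-visual agreement.
        log_odds = (audio_energy - energy_thresh) - 0.5 * (x_aud - mean) ** 2 / var_aud
        q_speak = 1.0 / (1.0 + np.exp(-log_odds))
    return mean, np.sqrt(1.0 / prec), q_speak

mean, std, p_speak = fuse_step(x_vis=0.2, x_aud=0.35, audio_energy=1.8)
print(f"position ~ N({mean:.2f}, {std:.2f}^2), P(speaking) = {p_speak:.2f}")
```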

7.
Comput Assist Surg (Abingdon); 26(1): 58-68, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34126014

ABSTRACT

Annotation of surgical video is important for establishing ground truth in surgical data science endeavors that involve computer vision. With the growth of the field over the last decade, several challenges have been identified in annotating the spatial, temporal, and clinical elements of surgical video, as well as in selecting annotators. In reviewing current challenges, we provide suggestions on opportunities for improvement and possible next steps to enable the translation of surgical data science efforts in surgical video analysis to clinical research and practice.

8.
Surgery; 169(5): 1253-1256, 2021 May.
Article in English | MEDLINE | ID: mdl-33272610

ABSTRACT

The fields of computer vision (CV) and artificial intelligence (AI) have undergone rapid advancements in the past decade, many of which have been applied to the analysis of intraoperative video. These advances are driven by the widespread application of deep learning, which leverages multiple layers of neural networks to teach computers complex tasks. Prior to these advances, applications of AI in the operating room were limited by our relative inability to train computers to accurately understand images with traditional machine learning (ML) techniques. The development and refinement of deep neural networks that can accurately identify objects in images and remember past surgical events has sparked a surge in applications of CV to intraoperative video analysis and has allowed for the accurate identification of surgical phases (steps) and instruments across a variety of procedures. In some cases, CV can even identify operative phases with accuracy similar to that of surgeons. Future research will likely expand on this foundation of surgical knowledge using larger video datasets and improved algorithms with greater accuracy and interpretability, creating clinically useful AI models that gain widespread adoption and augment the surgeon's ability to provide safer care for patients everywhere.


Subjects
Artificial Intelligence; General Surgery