Search | BVS Ecuador Search Portal

On the pitfalls of Batch Normalization for end-to-end video learning: A study on surgical workflow analysis.

Rivoir, Dominik; Funke, Isabel; Speidel, Stefanie.

Med Image Anal ; 94: 103126, 2024 May.

Article in English | MEDLINE | ID: mdl-38452578

ABSTRACT

Batch Normalization's (BN) unique property of depending on other samples in a batch is known to cause problems in several tasks, including sequence modeling. Yet, BN-related issues are hardly studied for long video understanding, despite the ubiquitous use of BN in CNNs (Convolutional Neural Networks) for feature extraction. Especially in surgical workflow analysis, where the lack of pretrained feature extractors has led to complex, multi-stage training pipelines, limited awareness of BN issues may have hidden the benefits of training CNNs and temporal models end to end. In this paper, we analyze pitfalls of BN in video learning, including issues specific to online tasks such as a 'cheating' effect in anticipation. We observe that BN's properties create major obstacles for end-to-end learning. However, using BN-free backbones, even simple CNN-LSTMs beat the state of the art on three surgical workflow benchmarks by utilizing adequate end-to-end training strategies which maximize temporal context. We conclude that awareness of BN's pitfalls is crucial for effective end-to-end learning in surgical tasks. By reproducing results on natural-video datasets, we hope our insights will benefit other areas of video learning as well. Code is available at: https://gitlab.com/nct_tso_public/pitfalls_bn.
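The batch dependence described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the authors' repository: the helper names `batch_norm_train` and `layer_norm` are hypothetical, and layer normalization is used only as one example of a per-sample normalization, not as the backbone choice made in the paper. The sketch places the same sample in two different batches and checks whether its normalized output changes.

```python
import numpy as np


def batch_norm_train(x, eps=1e-5):
    """Training-mode BN without learned scale/shift (hypothetical helper).

    Statistics are computed over the batch axis, so every output row
    depends on all other rows in the batch.
    """
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def layer_norm(x, eps=1e-5):
    """Per-sample normalization over the feature axis (hypothetical helper)."""
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


rng = np.random.default_rng(0)
sample = rng.normal(size=(1, 4))

# The same sample, placed in two batches with different companions.
batch_a = np.concatenate([sample, rng.normal(size=(3, 4))])
batch_b = np.concatenate([sample, rng.normal(loc=5.0, size=(3, 4))])

bn_a, bn_b = batch_norm_train(batch_a)[0], batch_norm_train(batch_b)[0]
ln_a, ln_b = layer_norm(batch_a)[0], layer_norm(batch_b)[0]

bn_depends_on_batch = not np.allclose(bn_a, bn_b)
ln_depends_on_batch = not np.allclose(ln_a, ln_b)

print("BN output changes with batch composition:", bn_depends_on_batch)
print("Per-sample norm changes with batch composition:", ln_depends_on_batch)
```

In a video setting, the other rows in a batch are often neighboring frames of the same sequence, which is one way information can leak between samples during training.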

Subject(s)

Neural Networks, Computer; Humans; Workflow

Exploring semantic consistency in unpaired image translation to generate data for surgical applications.

Venkatesh, Danush Kumar; Rivoir, Dominik; Pfeiffer, Micha; Kolbinger, Fiona; Distler, Marius; Weitz, Jürgen; Speidel, Stefanie.

Int J Comput Assist Radiol Surg ; 19(6): 985-993, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38407730

ABSTRACT

PURPOSE: In surgical computer vision applications, data privacy and expert annotation challenges impede the acquisition of labeled training data. Unpaired image-to-image translation techniques have been explored to automatically generate annotated datasets by translating synthetic images into a realistic domain. The preservation of structure and semantic consistency, i.e., per-class distribution during translation, poses a significant challenge, particularly in cases of semantic distributional mismatch. METHOD: This study empirically investigates various translation methods for generating data in surgical applications, explicitly focusing on semantic consistency. Through our analysis, we introduce a novel and simple combination of effective approaches, which we call ConStructS. The defined losses within this approach operate on multiple image patches and spatial resolutions during translation. RESULTS: Various state-of-the-art models were extensively evaluated on two challenging surgical datasets. With two different evaluation schemes, the semantic consistency and the usefulness of the translated images on downstream semantic segmentation tasks were evaluated. The results demonstrate the effectiveness of the ConStructS method in minimizing semantic distortion, with images generated by this model showing superior utility for downstream training. CONCLUSION: In this study, we tackle semantic inconsistency in unpaired image translation for surgical applications with minimal labeled data. The simple model (ConStructS) enhances consistency during translation and serves as a practical way of generating fully labeled and semantically consistent datasets at minimal cost. Our code is available at https://gitlab.com/nct_tso_public/constructs .

Subject(s)

Semantics; Humans; Surgery, Computer-Assisted/methods; Image Processing, Computer-Assisted/methods; Algorithms

AIxSuture: vision-based assessment of open suturing skills.

Hoffmann, Hanna; Funke, Isabel; Peters, Philipp; Venkatesh, Danush Kumar; Egger, Jan; Rivoir, Dominik; Röhrig, Rainer; Hölzle, Frank; Bodenstedt, Sebastian; Willemer, Marie-Christin; Speidel, Stefanie; Puladi, Behrus.

Int J Comput Assist Radiol Surg ; 19(6): 1045-1052, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38526613

ABSTRACT

PURPOSE: Efficient and precise surgical skills are essential in ensuring positive patient outcomes. By continuously providing real-time, data-driven, and objective evaluation of surgical performance, automated skill assessment has the potential to greatly improve surgical skill training. Whereas machine learning-based surgical skill assessment is gaining traction for minimally invasive techniques, this cannot be said for open surgery skills. Open surgery generally has more degrees of freedom when compared to minimally invasive surgery, making it more difficult to interpret. In this paper, we present novel approaches for skill assessment for open surgery skills. METHODS: We analyzed a novel video dataset for open suturing training. We provide a detailed analysis of the dataset and define evaluation guidelines, using state-of-the-art deep learning models. Furthermore, we present novel benchmarking results for surgical skill assessment in open suturing. The models are trained to classify a video into three skill levels based on the global rating score. To obtain initial results for video-based surgical skill classification, we benchmarked a temporal segment network with both an I3D and a Video Swin backbone on this dataset. RESULTS: The dataset is composed of 314 videos of approximately five minutes each. The benchmarked models reach an accuracy and F1 score of up to 75% and 72%, respectively. This is similar to the performance achieved by the individual raters, regarding inter-rater agreement and rater variability. We present the first end-to-end trained approach for skill assessment for open surgery training. CONCLUSION: We provide a thorough analysis of a new dataset as well as novel benchmarking results for surgical skill assessment. This opens the doors to new advances in skill assessment by enabling video-based skill assessment for classic surgical techniques with the potential to improve the surgical outcome of patients.
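For readers unfamiliar with the two reported metrics, the following sketch shows how accuracy and an F1 score could be computed for a three-class skill classifier. The abstract does not say how F1 is averaged across classes, so macro averaging is an assumption here. The class indices and the toy labels are invented for illustration and are not data from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of videos whose predicted skill level matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical skill levels: 0, 1 and 2 stand for three rating bands.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, classes=[0, 1, 2])
print(f"accuracy={acc:.3f}, macro F1={f1:.3f}")
```

Macro averaging weights each skill level equally, which matters when the levels are unevenly represented in a dataset.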

Subject(s)

Clinical Competence; Suture Techniques; Video Recording; Humans; Suture Techniques/education; Benchmarking
