RESUMO
Formalizing surgical activities as triplets of the used instruments, actions performed, and target anatomies is becoming a gold standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction which can be used to develop better Artificial Intelligence assistance for image-guided surgery. Earlier efforts and the CholecTriplet challenge introduced in 2021 have put together techniques aimed at recognizing these triplets from surgical footage. Estimating also the spatial locations of the triplets would offer a more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods, an in-depth analysis of the obtained results across multiple metrics, visual and procedural challenges; their significance, and useful insights for future research directions and applications in surgery.
Assuntos
Inteligência Artificial , Cirurgia Assistida por Computador , Humanos , Endoscopia , Algoritmos , Cirurgia Assistida por Computador/métodos , Instrumentos CirúrgicosRESUMO
PURPOSE: Photoacoustic tomography (PAT) is a novel imaging technique that can spatially resolve both morphological and functional tissue properties, such as vessel topology and tissue oxygenation. While this capacity makes PAT a promising modality for the diagnosis, treatment, and follow-up of various diseases, a current drawback is the limited field of view provided by the conventionally applied 2D probes. METHODS: In this paper, we present a novel approach to 3D reconstruction of PAT data (Tattoo tomography) that does not require an external tracking system and can smoothly be integrated into clinical workflows. It is based on an optical pattern placed on the region of interest prior to image acquisition. This pattern is designed in a way that a single tomographic image of it enables the recovery of the probe pose relative to the coordinate system of the pattern, which serves as a global coordinate system for image compounding. RESULTS: To investigate the feasibility of Tattoo tomography, we assessed the quality of 3D image reconstruction with experimental phantom data and in vivo forearm data. The results obtained with our prototype indicate that the Tattoo method enables the accurate and precise 3D reconstruction of PAT data and may be better suited for this task than the baseline method using optical tracking. CONCLUSIONS: In contrast to previous approaches to 3D ultrasound (US) or PAT reconstruction, the Tattoo approach neither requires complex external hardware nor training data acquired for a specific application. It could thus become a valuable tool for clinical freehand PAT.
Assuntos
Imageamento Tridimensional/métodos , Imagens de Fantasmas , Tatuagem/métodos , Tomografia Computadorizada por Raios X/métodos , Ultrassonografia/métodos , HumanosRESUMO
Image-based tracking of medical instruments is an integral part of surgical data science applications. Previous research has addressed the tasks of detecting, segmenting and tracking medical instruments based on laparoscopic video data. However, the proposed methods still tend to fail when applied to challenging images and do not generalize well to data they have not been trained on. This paper introduces the Heidelberg Colorectal (HeiCo) data set - the first publicly available data set enabling comprehensive benchmarking of medical instrument detection and segmentation algorithms with a specific emphasis on method robustness and generalization capabilities. Our data set comprises 30 laparoscopic videos and corresponding sensor data from medical devices in the operating room for three different types of laparoscopic surgery. Annotations include surgical phase labels for all video frames as well as information on instrument presence and corresponding instance-wise segmentation masks for surgical instruments (if any) in more than 10,000 individual frames. The data has successfully been used to organize international competitions within the Endoscopic Vision Challenges 2017 and 2019.