Results 1 - 4 of 4
1.
Article in English | MEDLINE | ID: mdl-38805333

ABSTRACT

Deep learning has been used across a large number of computer vision tasks; however, designing a network architecture for each task is time-consuming. Neural Architecture Search (NAS) promises to automatically build neural networks optimised for the given task and dataset. However, most NAS methods are constrained to a specific macro-architecture design, which makes them hard to apply to different tasks (classification, detection, segmentation). Following the work in Differentiable NAS (DNAS), we present a simple and efficient NAS method, Differentiable Parallel Operation (DIPO), that constructs a local search space in the form of a DIPO block and can easily be applied to any convolutional network by injecting it in place of the convolutions. The DIPO block's internal architecture and parameters are automatically optimised end-to-end for each task. We demonstrate the flexibility of our approach by applying DIPO to 4 model architectures (U-Net, HRNET, KAPAO and YOLOX) across different surgical tasks (surgical scene segmentation, surgical instrument detection, and surgical instrument pose estimation), evaluated across 5 datasets. Results show significant improvements in surgical scene segmentation (+10.5% in CholecSeg8K, +13.2% in CaDIS), instrument detection (+1.5% in ROBUST-MIS, +5.3% in RoboKP), and instrument pose estimation (+9.8% in RoboKP).
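
The core idea can be illustrated with a short sketch. The following PyTorch code is a minimal sketch, assuming a DARTS-style softmax weighting over a small set of parallel candidate operations; the class name DIPOBlockSketch, the particular candidate operations and the weighting scheme are illustrative assumptions rather than the paper's actual search space or implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DIPOBlockSketch(nn.Module):
    # Drop-in replacement for a single convolution: a softmax-weighted mixture
    # of parallel candidate operations. The mixing weights (architecture
    # parameters) are trained end-to-end together with the ordinary weights.
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Candidate operations are assumptions; the paper's search space may differ.
        self.ops = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.Conv2d(in_ch, out_ch, kernel_size=5, padding=2),
            nn.Sequential(  # depthwise-separable 3x3
                nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),
                nn.Conv2d(in_ch, out_ch, 1),
            ),
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
        ])
        # One architecture parameter per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

Injecting such a block in place of each convolution of a backbone such as U-Net lets the architecture parameters be optimised by gradient descent alongside the rest of the network, which is the property the abstract describes.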

2.
Int J Comput Assist Radiol Surg ; 19(2): 375-382, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37347345

ABSTRACT

PURPOSE: Semantic segmentation in surgical videos has applications in intra-operative guidance, post-operative analytics and surgical education. Models need to provide accurate predictions, since temporally inconsistent identification of anatomy can compromise patient safety. We propose a novel architecture for modelling temporal relationships in videos to address these issues. METHODS: We developed a temporal segmentation model that consists of a static encoder and a spatio-temporal decoder. The encoder processes individual frames whilst the decoder learns spatio-temporal relationships from frame sequences. The decoder can be used with any suitable encoder to improve temporal consistency. RESULTS: Model performance was evaluated on the CholecSeg8k dataset and a private dataset of robotic partial nephrectomy procedures. Mean Intersection over Union improved by 1.30% and 4.27%, respectively, on the two datasets when the temporal decoder was applied. Our model also showed improvements in temporal consistency of up to 7.23%. CONCLUSIONS: This work demonstrates an advance in video segmentation of surgical scenes, with potential applications in surgery with a view to improving patient outcomes. The proposed decoder can extend state-of-the-art static models and is shown to improve both per-frame segmentation output and video temporal consistency.
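
A minimal sketch of the described layout follows, assuming a lightweight 2D encoder applied frame-by-frame and a 3D-convolutional decoder that mixes features across time; the module names and layer choices are illustrative assumptions, not the authors' architecture.

import torch
import torch.nn as nn

class TemporalSegSketch(nn.Module):
    # Static per-frame encoder followed by a spatio-temporal decoder that
    # fuses information across a short clip of frames.
    def __init__(self, num_classes: int, feat_ch: int = 64):
        super().__init__()
        # Static encoder: in practice any pretrained 2D backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        # Spatio-temporal decoder: 3D convolutions mix features over time.
        self.decoder = nn.Sequential(
            nn.Conv3d(feat_ch, feat_ch, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(feat_ch, num_classes, kernel_size=1),
        )

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, 3, H, W) -> logits: (batch, time, classes, H, W)
        b, t, c, h, w = clip.shape
        feats = self.encoder(clip.reshape(b * t, c, h, w))            # (B*T, F, H, W)
        feats = feats.reshape(b, t, -1, h, w).permute(0, 2, 1, 3, 4)  # (B, F, T, H, W)
        logits = self.decoder(feats)                                  # (B, C, T, H, W)
        return logits.permute(0, 2, 1, 3, 4)

Because the decoder only consumes per-frame feature maps, it can be attached to any suitable static encoder, which is the flexibility the abstract highlights.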


Subjects
Robotics, Semantics, Humans, Learning, Nephrectomy, Postoperative Period
3.
Med Image Anal ; 86: 102803, 2023 May.
Article in English | MEDLINE | ID: mdl-37004378

ABSTRACT

Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet these are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as combined triplets of instrument, verb, and target delivers more comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and the assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms from the competing teams are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) scores ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison and an in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition, which is of utmost importance for the development of AI in surgery.
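
The ensemble and evaluation described above can be sketched as follows, assuming frame-level triplet probabilities from several models, a simple probability-averaging ensemble, and scikit-learn's average_precision_score for per-class AP; the challenge's official metric implementation (the ivtmetrics package) differs in detail, so this is only an illustrative approximation.

import numpy as np
from sklearn.metrics import average_precision_score

def ensemble_triplet_map(model_probs, labels):
    # model_probs: list of (num_frames, num_triplet_classes) probability arrays,
    #              one per model in the ensemble.
    # labels:      (num_frames, num_triplet_classes) binary ground-truth matrix.
    probs = np.mean(np.stack(model_probs, axis=0), axis=0)  # simple mean ensemble
    aps = []
    for k in range(labels.shape[1]):
        if labels[:, k].any():  # skip triplet classes absent from the ground truth
            aps.append(average_precision_score(labels[:, k], probs[:, k]))
    return float(np.mean(aps))  # mean average precision over triplet classes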


Subjects
Benchmarking, Laparoscopy, Humans, Algorithms, Operating Rooms, Workflow, Deep Learning
4.
Int J Comput Assist Radiol Surg ; 17(5): 953-960, 2022 May.
Article in English | MEDLINE | ID: mdl-35505149

ABSTRACT

PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity, such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic post-operative reports and analysis. A common approach in the literature for automatic surgical phase estimation is to decouple the problem into two stages: feature extraction from a single frame and temporal feature fusion. The two-stage formulation is used because of computational restrictions when processing large spatio-temporal sequences. METHODS: The majority of existing works focus on pushing performance solely through temporal model development. In contrast, we follow a data-centric approach and propose a training pipeline that enables models to maximise the usage of existing datasets, which are generally used in isolation. Specifically, we use the dense phase annotations available in Cholec80, and the sparse scene (i.e., instrument and anatomy) segmentation annotations available in CholecSeg8k for less than 5% of the overlapping frames. We propose a simple multi-task encoder that effectively fuses both annotation streams, when available, based on their importance, and jointly optimises both tasks for accurate phase prediction. RESULTS AND CONCLUSION: We show that with a small fraction of scene segmentation annotations, a relatively simple model can obtain results comparable to previous state-of-the-art and more complex architectures when evaluated in similar settings. We hope that this data-centric approach can encourage new research directions where data, and how to use it, plays an important role alongside model development.
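
A minimal sketch of the joint objective implied by the abstract, assuming cross-entropy losses for both tasks and a boolean mask marking the small fraction of frames that carry segmentation labels; the function name, tensor layout and fixed weighting are illustrative assumptions rather than the paper's exact formulation.

import torch.nn.functional as F

def multitask_loss(phase_logits, phase_labels, seg_logits, seg_labels, seg_mask,
                   seg_weight=1.0):
    # phase_logits: (B, num_phases), phase_labels: (B,)
    # seg_logits:   (B, num_classes, H, W), seg_labels: (B, H, W)
    # seg_mask:     (B,) boolean tensor, True where a segmentation annotation exists
    loss = F.cross_entropy(phase_logits, phase_labels)
    if seg_mask.any():
        # Segmentation term only for the <5% of frames that are annotated.
        seg_loss = F.cross_entropy(seg_logits[seg_mask], seg_labels[seg_mask])
        loss = loss + seg_weight * seg_loss
    return loss

The fixed seg_weight here stands in for the importance-based fusion mentioned in the abstract, which in practice would be learned or scheduled rather than constant.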


Subjects
Robotic Surgical Procedures, Humans, Workflow