Results 1 - 13 of 13
1.
Article in English | MEDLINE | ID: mdl-38805333

ABSTRACT

Deep learning has been used across a large number of computer vision tasks; however, designing the network architecture for each task is time-consuming. Neural Architecture Search (NAS) promises to automatically build neural networks, optimised for the given task and dataset. However, most NAS methods are constrained to a specific macro-architecture design, which makes them hard to apply to different tasks (classification, detection, segmentation). Following the work in Differentiable NAS (DNAS), we present a simple and efficient NAS method, Differentiable Parallel Operation (DIPO), that constructs a local search space in the form of a DIPO block and can easily be applied to any convolutional network by injecting it in place of the convolutions. The DIPO block's internal architecture and parameters are automatically optimised end-to-end for each task. We demonstrate the flexibility of our approach by applying DIPO to 4 model architectures (U-Net, HRNET, KAPAO and YOLOX) across different surgical tasks (surgical scene segmentation, surgical instrument detection, and surgical instrument pose estimation), evaluated across 5 datasets. Results show significant improvements in surgical scene segmentation (+10.5% in CholecSeg8k, +13.2% in CaDIS), instrument detection (+1.5% in ROBUST-MIS, +5.3% in RoboKP), and instrument pose estimation (+9.8% in RoboKP).
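The abstract does not publish the DIPO search space, but the core DNAS idea behind such a block — running several candidate operations in parallel and mixing them with softmax-weighted, learnable architecture parameters — can be sketched as follows. The candidate operation set below is an illustrative assumption, not the actual DIPO design:

```python
import numpy as np

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

# Candidate operations of a hypothetical parallel block; the real DIPO
# search space (convolution variants, etc.) is not given in the abstract.
ops = [
    lambda x: x,                   # identity / skip connection
    lambda x: np.maximum(x, 0.0),  # ReLU-like non-linearity
    lambda x: 0.5 * x,             # scaled linear op
]

def parallel_block(x, alphas):
    """DNAS-style continuous relaxation: softmax-weighted sum of all
    candidate ops, so the architecture weights `alphas` stay differentiable."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))

x = np.array([-1.0, 2.0])
y = parallel_block(x, np.zeros(3))  # uniform architecture weights before training
```

After training, the operation with the largest weight can be kept and the rest pruned, recovering a discrete architecture for the block.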

2.
Int J Comput Assist Radiol Surg ; 19(2): 375-382, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37347345

ABSTRACT

PURPOSE: Semantic segmentation in surgical videos has applications in intra-operative guidance, post-operative analytics and surgical education. Models need to provide accurate predictions, since temporally inconsistent identification of anatomy can hinder patient safety. We propose a novel architecture for modelling temporal relationships in videos to address these issues. METHODS: We developed a temporal segmentation model that includes a static encoder and a spatio-temporal decoder. The encoder processes individual frames, whilst the decoder learns spatio-temporal relationships from frame sequences. The decoder can be used with any suitable encoder to improve temporal consistency. RESULTS: Model performance was evaluated on the CholecSeg8k dataset and a private dataset of robotic partial nephrectomy procedures. Mean Intersection over Union improved by 1.30% and 4.27% respectively on each dataset when the temporal decoder was applied. Our model also displayed improvements in temporal consistency of up to 7.23%. CONCLUSIONS: This work demonstrates an advance in video segmentation of surgical scenes, with potential applications in surgery aimed at improving patient outcomes. The proposed decoder can extend state-of-the-art static models, and we show that it improves both per-frame segmentation output and video temporal consistency.
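The abstract reports temporal-consistency gains without defining the metric. A simple proxy — the fraction of per-pixel predictions that stay unchanged between consecutive frames — can be computed as below; this is an illustrative assumption, not necessarily the paper's exact metric:

```python
import numpy as np

def temporal_consistency(preds):
    """Mean fraction of per-pixel class predictions unchanged between
    consecutive frames; `preds` has shape (frames, pixels)."""
    agree = [np.mean(preds[t] == preds[t + 1]) for t in range(len(preds) - 1)]
    return float(np.mean(agree))

preds = np.array([[0, 0], [0, 1], [0, 1]])  # 3 frames, 2 pixels
tc = temporal_consistency(preds)
```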


Subjects
Robotics , Semantics , Humans , Learning , Nephrectomy , Postoperative Period
3.
Int J Comput Assist Radiol Surg ; 19(1): 61-68, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37340283

ABSTRACT

PURPOSE: Advances in surgical phase recognition are generally led by training deeper networks. Rather than going further with a more complex solution, we believe that current models can be exploited better. We propose a self-knowledge distillation framework that can be integrated into current state-of-the-art (SOTA) models without adding any complexity to the models or requiring extra annotations. METHODS: Knowledge distillation is a framework for network regularization in which knowledge is distilled from a teacher network to a student network. In self-knowledge distillation, the student model becomes the teacher, so that the network learns from itself. Most phase recognition models follow an encoder-decoder framework. Our framework utilizes self-knowledge distillation in both stages. The teacher model guides the training process of the student model to extract enhanced feature representations from the encoder and to build a more robust temporal decoder that tackles the over-segmentation problem. RESULTS: We validate our proposed framework on the public dataset Cholec80. Our framework is embedded on top of four popular SOTA approaches and consistently improves their performance. Specifically, our best GRU model boosts performance by [Formula: see text] accuracy and [Formula: see text] F1-score over the same baseline model. CONCLUSION: We embed a self-knowledge distillation framework for the first time in the surgical phase recognition training pipeline. Experimental results demonstrate that our simple yet powerful framework can improve the performance of existing phase recognition models. Moreover, our extensive experiments show that even with only 75% of the training set, we still achieve performance on par with the same baseline model trained on the full set.
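The distillation objective is not spelled out in the abstract. A common form, shown here as a hedged sketch, combines hard-label cross-entropy with a temperature-softened KL term pulling the student toward the teacher — in self-distillation, the teacher is the model itself (e.g. an earlier snapshot). The temperature and mixing weight below are illustrative defaults:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def self_distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Hard-label cross-entropy plus temperature-softened KL divergence
    to the teacher's distribution. T and alpha are illustrative defaults."""
    ce = -np.log(softmax(student_logits)[label])
    q_t = softmax(teacher_logits, T)
    q_s = softmax(student_logits, T)
    kl = np.sum(q_t * (np.log(q_t) - np.log(q_s))) * T * T
    return alpha * ce + (1.0 - alpha) * kl

logits = np.array([2.0, 0.0])
# With teacher == student the KL term vanishes, leaving only the CE part.
loss_same = self_distillation_loss(logits, logits, label=0, alpha=1.0)
```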


Subjects
Learning , Students , Humans
4.
Med Image Anal ; 86: 102803, 2023 05.
Article in English | MEDLINE | ID: mdl-37004378

ABSTRACT

Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of instrument, verb, and target delivers more comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and the assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms from the competing teams are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison between them and an in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition, which is of utmost importance for the development of AI in surgery.


Subjects
Benchmarking , Laparoscopy , Humans , Algorithms , Operating Rooms , Workflow , Deep Learning
5.
Int J Comput Assist Radiol Surg ; 17(12): 2173-2181, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36272018

ABSTRACT

PURPOSE: Bile duct injury is a significant problem in laparoscopic cholecystectomy and can have grave consequences for patient outcomes. Automatic identification of the critical structures (cystic duct and cystic artery) could potentially reduce complications during surgery by helping the surgeon establish the Critical View of Safety, or may eventually even provide real-time intra-operative guidance. METHODS: A computer vision model was trained to identify the critical structures. Label relaxation enabled the model to cope with ambiguous spatial extent and high annotation variability. Pseudo-label self-supervision allowed the model to use unlabelled data, which can be particularly beneficial when only scarce labelled data are available for training. Intrinsic variability in annotations was assessed across several annotators, quantifying the extent of annotation ambiguity and setting a baseline for model accuracy. RESULTS: Using 3050 labelled and 3682 unlabelled cholecystectomy frames, the model achieved an IoU of 65% and a presence detection F1 score of 75%. Inter-annotator IoU agreement was 70%, demonstrating that the model reached near human-level agreement on average on this dataset. The model's outputs were validated by three expert surgeons, who confirmed that they were accurate and promising for future usage. CONCLUSION: Identification of critical structures can achieve high accuracy and is a promising step towards computer-assisted intervention, in addition to potential applications in analytics and education. High accuracy and surgeon approval are maintained when detecting the structures separately as distinct classes. Future work will focus on guaranteeing safe identification of critical anatomy, including the bile duct, and validating the performance of automated approaches.
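For reference, the IoU figures quoted above follow the standard definition for a segmentation mask — intersection over union of the predicted and ground-truth pixel sets:

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union of two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

pred = np.array([1, 1, 0, 0], dtype=bool)
gt = np.array([1, 0, 1, 0], dtype=bool)
score = iou(pred, gt)
```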


Subjects
Cholecystectomy, Laparoscopic , Humans , Bile Ducts/injuries , Cholecystectomy , Hepatic Artery
6.
Int J Comput Assist Radiol Surg ; 17(5): 953-960, 2022 May.
Article in English | MEDLINE | ID: mdl-35505149

ABSTRACT

PURPOSE: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity, such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic post-operative reports and analysis. A common approach in the literature for performing automatic surgical phase estimation is to decouple the problem into two stages: feature extraction from a single frame and temporal feature fusion. The problem is split this way because of computational restrictions when processing large spatio-temporal sequences. METHODS: The majority of existing works focus on pushing performance solely through temporal model development. Differently, we follow a data-centric approach and propose a training pipeline that enables models to maximise the usage of existing datasets, which are generally used in isolation. Specifically, we use the dense phase annotations available in Cholec80, and the sparse scene (i.e., instrument and anatomy) segmentation annotations available in CholecSeg8k for less than 5% of the overlapping frames. We propose a simple multi-task encoder that effectively fuses both streams, when available, based on their importance, and jointly optimises them for accurate phase prediction. RESULTS AND CONCLUSION: We show that with a small fraction of scene segmentation annotations, a relatively simple model can obtain results comparable to previous state-of-the-art and more complex architectures when evaluated in similar settings. We hope that this data-centric approach can encourage new research directions in which data, and how to use it, play an important role alongside model development.


Subjects
Robotic Surgical Procedures , Humans , Workflow
7.
Front Cell Dev Biol ; 10: 842342, 2022.
Article in English | MEDLINE | ID: mdl-35433703

ABSTRACT

As sample preparation and imaging techniques have expanded and improved to include a variety of options for larger sizes and numbers of samples, the bottleneck in volumetric imaging is now data analysis. Annotation and segmentation are both common, yet difficult, data analysis tasks which are required to bring meaning to volumetric data. The SuRVoS application has been updated and redesigned to provide access to both manual and machine learning-based segmentation and annotation techniques, including support for crowd-sourced data. Combining adjacent, similar voxels (supervoxels) provides a mechanism for speeding up segmentation, both in the painting of annotation and by training a segmentation model on a small amount of annotation. The support for layers allows multiple datasets to be viewed and annotated together, which, for example, enables the use of correlative data (e.g. crowd-sourced annotations or secondary imaging techniques) to guide segmentation. The ability to work with larger data on high-performance servers with GPUs has been added through a client-server architecture; the PyTorch-based image processing and segmentation server is flexible and extensible, allowing the implementation of deep learning-based segmentation modules. The client side has been built around Napari, allowing integration of SuRVoS into an ecosystem for open-source image analysis, while the server side has been built with cloud computing and extensibility through plugins in mind. Together, these improvements to SuRVoS provide a platform for accelerating the annotation and segmentation of volumetric and correlative imaging data across modalities and scales.

8.
Int J Comput Assist Radiol Surg ; 17(5): 849-856, 2022 May.
Article in English | MEDLINE | ID: mdl-35353299

ABSTRACT

PURPOSE: We tackle the problem of online surgical phase recognition in laparoscopic procedures, which is key to developing context-aware support systems. We propose a novel approach that takes the temporal context of surgical videos into account by precise modeling of temporal neighborhoods. METHODS: We propose a two-stage model to perform phase recognition. A CNN model is used as a feature extractor to project RGB frames into a high-dimensional feature space. We introduce a novel paradigm for surgical phase recognition which utilizes graph neural networks to incorporate temporal information. Unlike recurrent neural networks and temporal convolution networks, our graph-based approach offers a more generic and flexible way of modeling temporal relationships. Each frame is a node in the graph, and the edges define temporal connections among the nodes. The flexible configuration of the temporal neighborhood comes at the price of losing temporal order. To mitigate this, our approach takes temporal order into account by encoding frame positions, which is important for reliably predicting surgical phases. RESULTS: Experiments are carried out on the public Cholec80 dataset, which contains 80 annotated videos. The experimental results highlight the superior performance of the proposed approach compared to state-of-the-art models on this dataset. CONCLUSION: A novel approach for formulating video-based surgical phase recognition is presented. The results indicate that temporal information can be incorporated using graph-based models, and that positional encoding is important to efficiently utilize temporal information. Graph networks also open the possibility of using evidence theory for uncertainty analysis in surgical phase recognition.
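The abstract's two ingredients — a temporal-neighborhood graph over frame nodes and a positional encoding that restores frame order — can be sketched as follows. The neighborhood size and the sinusoidal encoding are illustrative choices, not necessarily the paper's exact configuration:

```python
import numpy as np

def temporal_adjacency(n_frames, k):
    """Each frame node is connected to frames within +/- k steps."""
    A = np.zeros((n_frames, n_frames), dtype=int)
    for i in range(n_frames):
        for j in range(max(0, i - k), min(n_frames, i + k + 1)):
            if i != j:
                A[i, j] = 1
    return A

def positional_encoding(n_frames, d):
    """Sinusoidal encoding of frame position, restoring the temporal
    order that an unordered neighborhood graph loses."""
    pos = np.arange(n_frames)[:, None]
    i = np.arange(d // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d))
    pe = np.zeros((n_frames, d))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

A = temporal_adjacency(5, k=1)
pe = positional_encoding(5, d=8)
```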


Subjects
Laparoscopy , Neural Networks (Computer) , Humans , Laparoscopy/methods , Workflow
9.
J Neurosurg ; : 1-8, 2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34740198

ABSTRACT

OBJECTIVE: Surgical workflow analysis involves systematically breaking down operations into key phases and steps. Automatic analysis of this workflow has potential uses for surgical training, preoperative planning, and outcome prediction. Recent advances in machine learning (ML) and computer vision have allowed accurate automated workflow analysis of operative videos. In this Idea, Development, Exploration, Assessment, Long-term study (IDEAL) stage 0 study, the authors sought to use Touch Surgery for the development and validation of an ML-powered analysis of phases and steps in the endoscopic transsphenoidal approach (eTSA) for pituitary adenoma resection, a first for neurosurgery. METHODS: The surgical phases and steps of 50 anonymized eTSA operative videos were labeled by expert surgeons. Forty videos were used to train a combined convolutional and recurrent neural network model by Touch Surgery. Ten videos were used for model evaluation (accuracy, F1 score), comparing the phase and step recognition of surgeons to the automatic detection of the ML model. RESULTS: The longest phase was the sellar phase (median 28 minutes), followed by the nasal phase (median 22 minutes) and the closure phase (median 14 minutes). The longest steps were step 5 (tumor identification and excision, median 17 minutes); step 3 (posterior septectomy and removal of sphenoid septations, median 14 minutes); and step 4 (anterior sellar wall removal, median 10 minutes). There were substantial variations within the recorded procedures in terms of video appearances, step duration, and step order, with only 50% of videos containing all 7 steps performed sequentially in numerical order. Despite this, the model was able to output accurate recognition of surgical phases (91% accuracy, 90% F1 score) and steps (76% accuracy, 75% F1 score). CONCLUSIONS: In this IDEAL stage 0 study, ML techniques were developed to automatically analyze operative videos of eTSA pituitary surgery. This technology has previously been shown to be acceptable to neurosurgical teams and patients. ML-based surgical workflow analysis has numerous potential uses, such as education (e.g., automatic indexing of contemporary operative videos for teaching), improved operative efficiency (e.g., orchestrating the entire surgical team to a common workflow), and improved patient outcomes (e.g., comparison of surgical techniques or early detection of adverse events). Future directions include the real-time integration of Touch Surgery into the live operative environment as an IDEAL stage 1 (first-in-human) study, and further development of the underpinning ML models using larger data sets.
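The accuracy and F1 figures above follow the standard definitions; per-class F1, for instance, is the harmonic mean of precision and recall:

```python
import numpy as np

def f1_score(y_true, y_pred, label):
    """Per-class F1: harmonic mean of precision and recall."""
    tp = np.sum((y_pred == label) & (y_true == label))
    fp = np.sum((y_pred == label) & (y_true != label))
    fn = np.sum((y_pred != label) & (y_true == label))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])
f1 = f1_score(y_true, y_pred, label=1)
```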

10.
Med Image Anal ; 71: 102053, 2021 07.
Article in English | MEDLINE | ID: mdl-33864969

ABSTRACT

Video feedback provides a wealth of information about surgical procedures and is the main sensory cue for surgeons. Scene understanding is crucial to computer-assisted interventions (CAI) and to post-operative analysis of the surgical procedure. A fundamental building block of such capabilities is the identification and localization of surgical instruments and anatomical structures through semantic segmentation. Deep learning has advanced semantic segmentation techniques in recent years but is inherently reliant on the availability of labelled datasets for model training. This paper introduces a dataset for semantic segmentation of cataract surgery videos complementing the publicly available CATARACTS challenge dataset. In addition, we benchmark the performance of several state-of-the-art deep learning models for semantic segmentation on the presented dataset. The dataset is publicly available at https://cataracts-semantic-segmentation2020.grand-challenge.org/.


Subjects
Cataract Extraction , Cataract , Cataract/diagnostic imaging , Humans , Image Processing, Computer-Assisted , Semantics , Surgical Instruments
11.
Int J Comput Assist Radiol Surg ; 14(7): 1247-1257, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31165349

ABSTRACT

PURPOSE: We present a different approach for annotating laparoscopic images for segmentation in a weak fashion, and experimentally show that its accuracy, when trained with partial cross-entropy, is close to that obtained with fully supervised approaches. METHODS: We propose an approach that relies on weak annotations provided as stripes over the different objects in the image, and on partial cross-entropy as the loss function of a fully convolutional neural network, to obtain a dense pixel-level prediction map. RESULTS: We validate our method on three different datasets, providing qualitative results for all of them and quantitative results for two of them. The experiments show that our approach is able to obtain at least [Formula: see text] of the accuracy obtained with fully supervised methods for all the tested datasets, while requiring [Formula: see text][Formula: see text] less time to create the annotations compared to full supervision. CONCLUSIONS: With this work, we demonstrate that laparoscopic data can be segmented using very little annotated data while maintaining levels of accuracy comparable to those obtained with full supervision.
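Partial cross-entropy, as named in the abstract, simply restricts the loss to annotated pixels (the stripes) and ignores the rest. A minimal sketch over flattened per-pixel class probabilities, with an illustrative two-pixel example:

```python
import numpy as np

def partial_cross_entropy(probs, labels, mask):
    """Cross-entropy averaged over annotated pixels only; pixels with
    mask == 0 (outside the stripes) contribute nothing."""
    picked = probs[np.arange(labels.size), labels]
    ce = -np.log(picked)
    return (ce * mask).sum() / mask.sum()

# Two pixels, two classes; only the first pixel carries a stripe label.
probs = np.array([[0.5, 0.5], [0.8, 0.2]])
labels = np.array([0, 0])
mask = np.array([1.0, 0.0])
loss = partial_cross_entropy(probs, labels, mask)
```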


Subjects
Laparoscopy/methods , Surgical Instruments , Humans , Neural Networks (Computer)
12.
J Vis Exp ; (126)2017 08 23.
Article in English | MEDLINE | ID: mdl-28872144

ABSTRACT

Segmentation is the process of isolating specific regions or objects within an imaged volume, so that further study can be undertaken on these areas of interest. When considering the analysis of complex biological systems, the segmentation of three-dimensional image data is a time-consuming and labor-intensive step. With the increased availability of many imaging modalities and with automated data collection schemes, this poses an increased challenge for the modern experimental biologist to move from data to knowledge. This publication describes the use of SuRVoS Workbench, a program designed to address these issues by providing methods to semi-automatically segment complex biological volumetric data. Three datasets of differing magnification and imaging modalities are presented here, each highlighting a different segmentation strategy in SuRVoS. Phase-contrast X-ray tomography (microCT) of the fruiting body of a plant is used to demonstrate segmentation using model training, cryo-electron tomography (cryoET) of human platelets is used to demonstrate segmentation using super- and megavoxels, and cryo soft X-ray tomography (cryoSXT) of a mammalian cell line is used to demonstrate the label-splitting tools. Strategies and parameters for each datatype are also presented. By blending a selection of semi-automatic processes into a single interactive tool, SuRVoS provides several benefits. Overall time to segment volumetric data is reduced by a factor of five compared to manual segmentation, a mainstay in many image processing fields. This is a significant saving, given that full manual segmentation can take weeks of effort. Additionally, subjectivity is addressed through the use of computationally identified boundaries and by splitting complex collections of objects by their calculated properties rather than on a case-by-case basis.
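SuRVoS's supervoxels group adjacent, similar voxels so that annotation and model training operate on regions rather than on individual voxels. The crude regular-grid stand-in below only illustrates that voxel-to-region reduction; real supervoxel algorithms (e.g. SLIC) adapt region boundaries to image content:

```python
import numpy as np

def block_supervoxels(vol, bs):
    """Crude regular-grid 'supervoxels': every bs x bs x bs block gets one
    label. Unlike SLIC-style supervoxels, boundaries here ignore content."""
    z, y, x = np.indices(vol.shape)
    return (z // bs) * 10000 + (y // bs) * 100 + (x // bs)

def region_mean_features(vol, labels):
    """One mean-intensity feature per region, instead of one per voxel."""
    return {lab: vol[labels == lab].mean() for lab in np.unique(labels)}

vol = np.arange(8, dtype=float).reshape(2, 2, 2)
labels = block_supervoxels(vol, bs=2)
feats = region_mean_features(vol, labels)
```

Training a classifier on per-region features like these, instead of per-voxel intensities, is what makes segmentation from a small amount of annotation tractable.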


Subjects
Image Processing, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Humans
13.
J Struct Biol ; 198(1): 43-53, 2017 04.
Article in English | MEDLINE | ID: mdl-28246039

ABSTRACT

Segmentation of biological volumes is a crucial step needed to fully analyse their scientific content. Not having access to convenient tools with which to segment or annotate the data means many biological volumes remain under-utilised. Automatic segmentation of biological volumes is still a very challenging research field, and current methods usually require a large amount of manually produced training data to deliver a high-quality segmentation. However, the complex appearance of cellular features and the high variance from one sample to another, along with the time-consuming work of manually labelling complete volumes, make the required training data very scarce or non-existent. Thus, fully automatic approaches are often infeasible for many practical applications. With the aim of unifying the segmentation power of automatic approaches with user expertise and the ability to manually annotate biological samples, we present a new workbench named SuRVoS (Super-Region Volume Segmentation). Within this software, a volume to be segmented is first partitioned into hierarchical segmentation layers (named Super-Regions) and is then interactively segmented with the user's knowledge input in the form of training annotations. SuRVoS first learns from, and then extends, user inputs to the rest of the volume, while using Super-Regions for quicker and easier segmentation than when using a voxel grid. These benefits are especially noticeable on noisy, low-dose biological datasets.


Subjects
Datasets as Topic , Software , Algorithms , Data Curation/methods , Machine Learning