1.
Med Image Anal; 96: 103195, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38815359

ABSTRACT

Colorectal cancer is one of the most common cancers in the world. While colonoscopy is an effective screening technique, navigating an endoscope through the colon to detect polyps is challenging. A 3D map of the observed surfaces could enhance the identification of unscreened colon tissue and serve as a training platform. However, reconstructing the colon from video footage remains difficult. Learning-based approaches hold promise as robust alternatives but require extensive datasets. To establish a benchmark dataset, the 2022 EndoVis sub-challenge SimCol3D aimed to facilitate data-driven depth and pose prediction during colonoscopy. The challenge was hosted as part of MICCAI 2022 in Singapore. Six teams from around the world, with representatives from academia and industry, participated in the three sub-challenges: synthetic depth prediction, synthetic pose prediction, and real pose prediction. This paper describes the challenge, the submitted methods, and their results. We show that depth prediction from synthetic colonoscopy images is robustly solvable, while pose estimation remains an open research question.

2.
Article in English | MEDLINE | ID: mdl-38758289

ABSTRACT

PURPOSE: The recent segment anything model (SAM) has demonstrated impressive performance in various applications with point, text, or bounding-box prompts. However, in safety-critical surgical tasks, prompting is not possible because (1) per-frame prompts are unavailable for supervised learning, (2) prompting frame-by-frame is unrealistic in a real-time tracking application, and (3) annotating prompts for offline applications is expensive. METHODS: We develop Surgical-DeSAM, which generates automatic bounding-box prompts for a decoupled SAM to obtain instrument segmentation in real-time robotic surgery. We utilise a commonly used detection architecture, DETR, and fine-tune it to obtain bounding-box prompts for the instruments. We then employ decoupling SAM (DeSAM) by replacing the image encoder with the DETR encoder and fine-tuning the prompt encoder and mask decoder to obtain instance segmentation for the surgical instruments. To improve detection performance, we adopt the Swin-transformer for better feature representation. RESULTS: The proposed method has been validated on two publicly available datasets from the MICCAI surgical instrument segmentation challenges EndoVis 2017 and 2018. Its performance is also compared with SOTA instrument segmentation methods, showing significant improvements with dice metrics of 89.62 and 90.70 for EndoVis 2017 and 2018, respectively. CONCLUSION: Our extensive experiments and validations demonstrate that Surgical-DeSAM enables real-time instrument segmentation without any additional prompting and outperforms other SOTA segmentation methods.
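The dice metric reported in the results above measures overlap between a predicted and a ground-truth mask; a minimal generic sketch (not the authors' evaluation code):

```python
# Dice coefficient between two binary masks (flat 0/1 sequences).
# eps guards against division by zero for empty masks.
def dice(pred, gt, eps=1e-7):
    inter = sum(p * g for p, g in zip(pred, gt))
    return (2.0 * inter + eps) / (sum(pred) + sum(gt) + eps)

print(round(dice([1, 1, 0, 0], [1, 0, 0, 0]), 3))  # 2*1/(2+1) ≈ 0.667
```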

3.
Article in English | MEDLINE | ID: mdl-38528306

ABSTRACT

PURPOSE: Endoscopic pituitary surgery entails navigating through the nasal cavity and sphenoid sinus to access the sella using an endoscope. The procedure is intricate due to the proximity of crucial anatomical structures (e.g. carotid arteries and optic nerves) to pituitary tumours, and any unintended damage can lead to severe complications including blindness and death. Intraoperative guidance during this surgery could support improved localization of the critical structures, reducing the risk of complications. METHODS: A deep learning network, PitSurgRT, is proposed for real-time localization of critical structures in endoscopic pituitary surgery. The network uses a high-resolution net (HRNet) backbone with a multi-head output that jointly localizes critical anatomical structures while simultaneously segmenting larger structures. Moreover, the trained model is optimized and accelerated using TensorRT. Finally, the model predictions are shown to neurosurgeons to test their guidance capability. RESULTS: Compared with the state-of-the-art method, our model significantly reduces the mean error in landmark detection of the critical structures from 138.76 to 54.40 pixels in a 1280 × 720-pixel image. Furthermore, the semantic segmentation of the most critical structure, the sella, is improved by 4.39% IoU. The inference speed of the accelerated model reaches 298 frames per second with floating-point-16 precision. In a study of 15 neurosurgeons, 88.67% of predictions were considered accurate enough for real-time guidance. CONCLUSION: The results of the quantitative evaluation, real-time acceleration, and neurosurgeon study demonstrate that the proposed method is highly promising for providing real-time intraoperative guidance to the critical anatomical structures in endoscopic pituitary surgery.

4.
Article in English | MEDLINE | ID: mdl-38459402

ABSTRACT

PURPOSE: Depth estimation in robotic surgery is vital for 3D reconstruction, surgical navigation and augmented reality visualization. Although foundation models exhibit outstanding performance in many vision tasks, including depth estimation (e.g., DINOv2), recent works have observed their limitations in medical and surgical domain-specific applications. This work presents a low-rank adaptation (LoRA) of the foundation model for surgical depth estimation. METHODS: We design a foundation-model-based depth estimation method, referred to as Surgical-DINO, a low-rank adaptation of DINOv2 for depth estimation in endoscopic surgery. We build LoRA layers and integrate them into DINO to adapt to surgery-specific domain knowledge instead of conventional fine-tuning. During training, we freeze the DINO image encoder, which shows excellent visual representation capacity, and optimize only the LoRA layers and depth decoder to integrate features from the surgical scene. RESULTS: Our model is extensively validated on the SCARED MICCAI challenge dataset, collected from da Vinci Xi endoscope surgery. We empirically show that Surgical-DINO significantly outperforms all state-of-the-art models in endoscopic depth estimation tasks. Ablation studies show clear evidence of the remarkable effect of our LoRA layers and adaptation. CONCLUSION: Surgical-DINO sheds light on the successful adaptation of foundation models to the surgical domain for depth estimation. The results show clearly that zero-shot prediction with weights pre-trained on computer vision datasets, or naive fine-tuning, is not sufficient to use a foundation model directly in the surgical domain.
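The LoRA idea described above — freezing the pretrained weight and training only a low-rank update — can be sketched as follows. All names and dimensions are illustrative placeholders, not taken from the Surgical-DINO implementation:

```python
import numpy as np

# LoRA sketch: a frozen weight W is adapted as W + (alpha/r) * B @ A,
# where only A (r x d_in) and B (d_out x r) would be trained.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 8, 8, 2, 4

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))  # trainable low-rank factor
B = np.zeros((d_out, r))                    # zero init: start = pretrained

def lora_forward(x):
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B = 0 the adapted layer reproduces the frozen layer exactly.
assert np.allclose(lora_forward(x), W @ x)
```

Only `A` and `B` (plus the depth decoder in the paper) would receive gradients, which is what makes the adaptation cheap relative to full fine-tuning.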

5.
IEEE Trans Med Imaging; 43(6): 2291-2302, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38381643

ABSTRACT

Deep Neural Network (DNN)-based semantic segmentation of robotic instruments and tissues can enhance the precision of surgical activities in robot-assisted surgery. However, unlike biological learning, DNNs cannot learn incremental tasks over time and exhibit catastrophic forgetting: a sharp decline in performance on previously learned tasks after learning a new one. Specifically, when data are scarce, the model shows a rapid drop in performance on previously learned instruments after learning new data with new instruments. The problem becomes worse when privacy concerns prevent releasing the old instruments' dataset used by the old model, and data for new or updated versions of the instruments are unavailable to the continual learning model. For this purpose, we develop a privacy-preserving synthetic continual semantic segmentation framework by blending and harmonizing (i) open-source old-instrument foregrounds with synthesized backgrounds, without revealing real patient data in public, and (ii) new-instrument foregrounds with extensively augmented real backgrounds. To boost balanced logit distillation from the old model to the continual learning model, we design overlapping class-aware temperature normalization (CAT) by controlling model learning utility. We also introduce multi-scale shifted-feature distillation (SD) to maintain long- and short-range spatial relationships among semantic objects, where conventional short-range spatial features with limited information reduce the power of feature distillation. We demonstrate the effectiveness of our framework on the EndoVis 2017 and 2018 instrument segmentation datasets in a generalized continual learning setting. Code is available at https://github.com/XuMengyaAmy/Synthetic_CAT_SD.
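The logit distillation that CAT builds on can be illustrated with plain temperature-scaled distillation; the overlapping class-aware temperatures of the paper are simplified here to a single scalar T, an assumption made purely for illustration:

```python
import math

# Temperature-scaled distillation sketch: soften teacher and student logits
# with temperature T, then compare the distributions with KL divergence.
def softmax(logits, T=1.0):
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, T=2.0):
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL(p_t || p_s), scaled by T^2 as is conventional in distillation
    return T * T * sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))

print(kd_loss([2.0, 0.5, 0.1], [2.0, 0.5, 0.1]))  # 0.0 when logits match
```

Raising T flattens both distributions, transferring more of the teacher's "dark knowledge" about non-target classes.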


Subjects
Deep Learning; Robotic Surgical Procedures; Semantics; Robotic Surgical Procedures/methods; Humans; Image Processing, Computer-Assisted/methods; Algorithms; Neural Networks, Computer; Privacy
6.
Front Surg; 10: 1222859, 2023.
Article in English | MEDLINE | ID: mdl-37780914

ABSTRACT

Background: Endoscopic endonasal surgery is an established minimally invasive technique for resecting pituitary adenomas. However, understanding orientation and identifying critical neurovascular structures in this anatomically dense region can be challenging. In clinical practice, commercial navigation systems use a tracked pointer for guidance. Augmented Reality (AR) is an emerging technology used for surgical guidance. It can be tracker-based or vision-based, but neither is widely used in pituitary surgery. Methods: This pre-clinical study aims to assess the accuracy of tracker-based navigation systems, including those that allow for AR. Two setups were used to conduct simulations: (1) the standard pointer setup, tracked by an infrared camera; and (2) the endoscope setup that allows for AR, using reflective markers on the end of the endoscope, tracked by infrared cameras. The error sources were estimated by calculating the Euclidean distance between a point's true location and its location after passing through the noisy system. A phantom study was then conducted to verify the in silico simulation results and show a working example of image-based navigation errors in current methodologies. Results: The errors of the tracked pointer and tracked endoscope simulations were 1.7 and 2.5 mm, respectively. The phantom study showed errors of 2.14 and 3.21 mm for the tracked pointer and tracked endoscope setups, respectively. Discussion: In pituitary surgery, precise identification of neighboring structures is crucial for success. However, our simulations reveal that the errors of the tracked approaches are too large to meet the fine error margins required for pituitary surgery. To achieve the required accuracy, we would need much more accurate tracking, better calibration, and improved registration techniques.
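The simulation procedure described above — perturbing a point with noise and measuring the Euclidean distance back to its true location — can be sketched as a small Monte-Carlo experiment. The Gaussian noise model and sigma below are assumed values for illustration, not the study's calibrated error sources:

```python
import math
import random

# Monte-Carlo sketch: perturb a 3-D point with zero-mean Gaussian noise and
# report the mean Euclidean distance between true and noisy positions.
def mean_tracking_error(sigma_mm=1.0, trials=20000, seed=42):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        dx, dy, dz = (rng.gauss(0.0, sigma_mm) for _ in range(3))
        total += math.sqrt(dx * dx + dy * dy + dz * dz)
    return total / trials

# For isotropic 3-D Gaussian noise the expected distance is
# sigma * sqrt(8/pi) ≈ 1.6 * sigma, so per-axis noise understates
# the point-to-point error a navigation system actually exhibits.
print(round(mean_tracking_error(), 2))
```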

7.
Int J Comput Assist Radiol Surg; 18(10): 1875-1883, 2023 Oct.
Article in English | MEDLINE | ID: mdl-36862365

ABSTRACT

PURPOSE: In curriculum learning, the idea is to train on easier samples first and gradually increase the difficulty, while in self-paced learning, a pacing function defines the speed at which training progresses. While both methods rely heavily on the ability to score the difficulty of data samples, an optimal scoring function is still under exploration. METHODOLOGY: Distillation is a knowledge transfer approach in which a teacher network guides a student network by feeding it a sequence of random samples. We argue that guiding student networks with an efficient curriculum strategy can improve model generalization and robustness. For this purpose, we design uncertainty-based paced curriculum learning in self-distillation for medical image segmentation. We fuse prediction uncertainty and annotation-boundary uncertainty to develop a novel paced-curriculum distillation (P-CD). We utilize the teacher model to obtain prediction uncertainty, and spatially varying label smoothing with a Gaussian kernel to generate segmentation-boundary uncertainty from the annotation. We also investigate the robustness of our method by applying various types and severities of image perturbation and corruption. RESULTS: The proposed technique is validated on two medical datasets, breast ultrasound image segmentation and robot-assisted surgical scene segmentation, and achieves significantly better segmentation performance and robustness. CONCLUSION: P-CD improves performance and obtains better generalization and robustness under dataset shift. While curriculum learning requires extensive tuning of hyper-parameters for the pacing function, the level of performance improvement outweighs this limitation.
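The spatially varying label smoothing used above to derive boundary uncertainty can be illustrated in 1-D: convolving a hard 0/1 annotation with a Gaussian kernel yields soft labels near the boundary and hard labels elsewhere. This is a simplified sketch, not the P-CD implementation (which operates on 2-D masks):

```python
import math

# Normalized 1-D Gaussian kernel of given radius.
def gaussian_kernel(radius=2, sigma=1.0):
    k = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

# Convolve a hard 0/1 annotation with the kernel (replicate-padded edges),
# so labels soften only near the segmentation boundary.
def smooth_labels(mask, radius=2, sigma=1.0):
    k = gaussian_kernel(radius, sigma)
    out = []
    for i in range(len(mask)):
        acc = 0.0
        for j, w in enumerate(k):
            idx = min(max(i + j - radius, 0), len(mask) - 1)
            acc += w * mask[idx]
        out.append(acc)
    return out

soft = smooth_labels([0, 0, 0, 1, 1, 1])
# Deep inside each region labels stay near 0 or 1; near the boundary they soften.
print([round(v, 2) for v in soft])
```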


Subjects
Curriculum; Distillation; Humans; Uncertainty; Learning; Algorithms; Image Processing, Computer-Assisted
8.
Int J Comput Assist Radiol Surg; 18(5): 921-928, 2023 May.
Article in English | MEDLINE | ID: mdl-36648701

ABSTRACT

PURPOSE: Surgical scene understanding with tool-tissue interaction recognition and automatic report generation can play an important role in intra-operative guidance, decision-making, and postoperative analysis in robotic surgery. However, domain shifts between different surgeries, with inter- and intra-patient variation and novel instrument appearances, degrade model prediction performance. Moreover, these tasks require output from multiple models, which can be computationally expensive and affect real-time performance. METHODOLOGY: A multi-task learning (MTL) model is proposed for surgical report generation and tool-tissue interaction prediction that deals with domain shift. The model consists of a shared feature extractor, a mesh-transformer branch for captioning, and a graph attention branch for tool-tissue interaction prediction. The shared feature extractor employs class-incremental contrastive learning to tackle intensity shift and novel class appearance in the target domain. We build Laplacian-of-Gaussian-based curriculum learning into both the shared and task-specific branches to enhance model learning. We incorporate a task-aware asynchronous MTL optimization technique to fine-tune the shared weights and converge both tasks optimally. RESULTS: The proposed MTL model trained using task-aware optimization and fine-tuning techniques reported balanced performance (BLEU score of 0.4049 for scene captioning and accuracy of 0.3508 for interaction detection) for both tasks on the target domain and performed on par with single-task models in domain adaptation. CONCLUSION: The proposed multi-task model was able to adapt to domain shifts, incorporate novel instruments in the target domain, and perform tool-tissue interaction detection and report generation on par with single-task models.


Subjects
Learning; Robotic Surgical Procedures; Humans; Curriculum; Normal Distribution; Postoperative Period
9.
Biomimetics (Basel); 7(2), 2022 May 28.
Article in English | MEDLINE | ID: mdl-35735584

ABSTRACT

Surgical scene understanding is a key barrier to situation-aware robotic surgeries and the associated surgical training. In the presence of domain shifts and the inclusion of new instruments and tissues, learning domain generalization (DG) plays a pivotal role in expanding instrument-tissue interaction detection to new domains in robotic surgery. Mimicking the ability of humans to incrementally learn new skills without forgetting old skills in a similar domain, we employ incremental DG on scene graphs to predict instrument-tissue interaction during robot-assisted surgery. To achieve incremental DG, we incorporate incremental learning (IL) to accommodate new instruments, and knowledge-distillation-based student-teacher learning to tackle domain shifts in the new domain. Additionally, we design an enhanced curriculum by smoothing (E-CBS) based on Laplacian of Gaussian (LoG) and Gaussian kernels, and integrate it with the feature extraction network (FEN) and graph network to improve instrument-tissue interaction performance. Furthermore, the FEN's and graph network's logits are normalized by temperature normalization (T-Norm), and its effect on model calibration is studied. Quantitative and qualitative analysis shows that our incrementally domain-generalized interaction detection model adapts to the target domain (transoral robotic surgery) while retaining its performance in the source domain (nephrectomy surgery). Additionally, the graph model enhanced by E-CBS and T-Norm outperforms other state-of-the-art models, and the incremental DG technique performs better than naive domain adaptation and DG techniques.

10.
Comput Med Imaging Graph; 91: 101906, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34175548

ABSTRACT

Accurate prognosis of glioblastoma multiforme (GBM) plays an essential role in planning correlated surgeries and treatments. Conventional survival prediction models rely on radiomic features from magnetic resonance imaging (MRI). In this paper, we propose a radiogenomic overall survival (OS) prediction approach that incorporates gene expression data with radiomic features such as shape, geometry, and clinical information. We exploit the TCGA (The Cancer Genome Atlas) dataset and synthesize missing MRI modalities using a fully convolutional network (FCN) within a conditional generative adversarial network (cGAN). Meanwhile, the same FCN architecture enables tumor segmentation from the available and synthesized MRI modalities. The proposed FCN architecture comprises octave convolution (OctConv) and a novel decoder with skip connections in a spatial and channel squeeze-and-excitation (skip-scSE) block. OctConv processes low- and high-frequency features separately and improves model efficiency by reducing channel-wise redundancy. Skip-scSE applies spatial and channel-wise excitation to emphasize essential features, and uses skip connections to reduce sparsity in the learned parameters of deeper layers. The proposed approaches are evaluated through comparative experiments with state-of-the-art models in synthesis, segmentation, and OS prediction. We observe that adding the missing MRI modality improves segmentation prediction, that expression levels of gene markers contribute strongly to GBM prognosis prediction, and that fused radiogenomic features boost OS estimation.
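The spatial and channel squeeze-and-excitation idea behind the skip-scSE block can be sketched with toy numpy weights. In the paper this is a trained layer inside the FCN; everything below, including the weight shapes, is illustrative only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# scSE sketch for a feature map of shape (C, H, W):
#  - channel excitation: global average pool -> per-channel gate
#  - spatial excitation: 1x1 projection across channels -> per-pixel gate
#  - combine the two recalibrated maps elementwise.
def scse(feat, w_c, w_s):
    z = feat.mean(axis=(1, 2))                              # (C,) pooled stats
    cse = feat * sigmoid(w_c @ z)[:, None, None]            # channel-gated
    sse = feat * sigmoid(np.tensordot(w_s, feat, axes=1))   # (H, W) pixel gate
    return np.maximum(cse, sse)

rng = np.random.default_rng(1)
f = rng.normal(size=(4, 3, 3))
out = scse(f, rng.normal(size=(4, 4)), rng.normal(size=4))
assert out.shape == f.shape  # recalibration preserves the tensor shape
```

The skip connection of skip-scSE (retaining the unexcited features) is omitted here for brevity.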


Subjects
Glioblastoma; Glioblastoma/diagnostic imaging; Glioblastoma/genetics; Humans; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Prognosis
11.
Med Image Anal; 67: 101837, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33129153

ABSTRACT

Representation learning of task-oriented attention while tracking instruments holds vast potential in image-guided robotic surgery. Incorporating cognitive ability to automate camera control enables the surgeon to concentrate more on handling surgical instruments. The objective is to reduce operation time and facilitate the surgery for both surgeons and patients. We propose an end-to-end trainable Spatio-Temporal Multi-Task Learning (ST-MTL) model with a shared encoder and spatio-temporal decoders for real-time surgical instrument segmentation and task-oriented saliency detection. In an MTL model with shared parameters, optimizing multiple loss functions to a convergence point is still an open challenge. We tackle the problem with a novel asynchronous spatio-temporal optimization (ASTO) technique that calculates independent gradients for each decoder. We also design a competitive squeeze-and-excitation unit by casting a skip connection that retains weak features, excites strong features, and performs dynamic spatial and channel-wise feature recalibration. To capture longer-term spatio-temporal dependencies, we enhance the long short-term memory (LSTM) module by concatenating high-level encoder features of consecutive frames. We also introduce a Sinkhorn-regularized loss to enhance task-oriented saliency detection while preserving computational efficiency. We generate task-aware saliency maps and scanpaths of the instruments on the dataset of the MICCAI 2017 robotic instrument segmentation challenge. Compared to state-of-the-art segmentation and saliency methods, our model outperforms them on most evaluation metrics and produces outstanding performance in the challenge.
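The Sinkhorn normalization underlying the Sinkhorn-regularized loss alternates row and column normalizations until a positive matrix is approximately doubly stochastic; a minimal sketch of that iteration (not the paper's loss function):

```python
# Sinkhorn iteration: repeatedly normalize rows, then columns, of a
# strictly positive matrix; the result converges to a doubly stochastic
# matrix (all rows and columns sum to 1).
def sinkhorn(mat, iters=50):
    m = [row[:] for row in mat]  # copy so the input is untouched
    for _ in range(iters):
        for row in m:                              # row normalization
            s = sum(row)
            for j in range(len(row)):
                row[j] /= s
        for j in range(len(m[0])):                 # column normalization
            s = sum(row[j] for row in m)
            for row in m:
                row[j] /= s
    return m

p = sinkhorn([[1.0, 2.0], [3.0, 4.0]])
assert all(abs(sum(row) - 1.0) < 1e-6 for row in p)  # rows sum to 1
```

A loss built on this projection can compare predicted and ground-truth saliency as distributions while staying cheap to compute.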


Subjects
Robotic Surgical Procedures; Surgery, Computer-Assisted; Humans; Learning; Surgical Instruments
12.
Med Biol Eng Comput; 58(8): 1767-1777, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32488372

ABSTRACT

Glioblastoma multiforme (GBM) is a very aggressive and infiltrative brain tumor with a high mortality rate. Radiomic models with handcrafted features exist for estimating glioblastoma prognosis. In this work, we evaluate to what extent combining genomic with radiomic features impacts the prognosis of overall survival (OS) in patients with GBM. We apply a hypercolumn-based convolutional network to segment tumor regions from magnetic resonance images (MRI), extract radiomic features (geometric, shape, histogram), and fuse them with gene expression profiling data to predict the survival rate for each patient. Several state-of-the-art regression models, such as linear regression, support vector machines, and neural networks, are exploited to conduct the prognosis analysis. The Cancer Genome Atlas (TCGA) dataset of MRI and gene expression profiling is used to observe model performance on radiomic, genomic, and radiogenomic features. The results demonstrate that genomic data are correlated with GBM OS prediction, and that the radiogenomic model outperforms both the radiomic and genomic models. We further identify the most significant genes, such as IL1B, KLHL4, ATP1A2, IQGAP2, and TMSL8, which contribute strongly to the prognosis analysis. Graphical Abstract: An overview of our proposed fully automated "radiogenomic" approach for survival prediction, which fuses geometric, intensity, volumetric, genomic, and clinical information to predict OS.
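The fusion-plus-regression step can be sketched as concatenating radiomic and genomic feature vectors and fitting an ordinary least-squares model. The feature values, dimensions, and survival targets below are synthetic placeholders, not TCGA data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
radiomic = rng.normal(size=(n, 3))   # stand-ins for shape/geometry/histogram features
genomic = rng.normal(size=(n, 2))    # stand-ins for gene expression levels
true_w = np.array([0.5, -1.0, 0.2, 2.0, 1.5])

X = np.hstack([radiomic, genomic])              # radiogenomic fusion by concatenation
y = X @ true_w + rng.normal(scale=0.1, size=n)  # synthetic survival target

# Ordinary least squares recovers the generating weights from the fused features.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(w_hat, true_w, atol=0.1)
```

In the paper the same fused representation also feeds stronger regressors (SVM, neural network); linear regression is shown here because it is the simplest member of that family.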


Subjects
Brain Neoplasms/mortality; Glioblastoma/mortality; Brain Neoplasms/genetics; Brain Neoplasms/pathology; Gene Expression Profiling/methods; Glioblastoma/genetics; Glioblastoma/pathology; Humans; Magnetic Resonance Imaging/methods; Prognosis; Survival Rate; Transcriptome/genetics
13.
Int J Comput Assist Radiol Surg; 15(3): 437-443, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31960247

ABSTRACT

PURPOSE: Ultrasound (US)-guided percutaneous kidney biopsy is a challenge for interventionists, as US artefacts prevent accurate viewing of the biopsy needle tip. Automatic needle tracking and trajectory prediction can increase operator confidence in performing biopsies, reduce procedure time, minimize the risk of inadvertent biopsy bleeding, and enable future image-guided robotic procedures. METHODS: In this paper, we propose a tracking-by-segmentation model with spatial and channel "Squeeze and Excitation" (scSE) for US needle detection and trajectory prediction. We adopt a light deep learning architecture (LinkNet) as our segmentation baseline network and integrate the scSE module to learn spatial information for better prediction. The proposed model is trained on US images from anonymized kidney biopsy clips of 8 patients. The contour is obtained using the border-following algorithm, and the area is calculated using Green's formula. Trajectory prediction is made by extrapolating from the smallest bounding box that can capture the contour. RESULTS: We train and test our model on a total of 996 images extracted from 102 short videos at a rate of 3 frames per second from each video. A set of 794 images is used for training and 202 images for testing. Our model achieves an IOU of 41.01%, a dice accuracy of 56.65%, an F1-score of 36.61%, and a root-mean-square angle error of 13.3[Formula: see text]. We are thus able to predict and extrapolate the trajectory of the biopsy needle with decent accuracy, helping interventionists to better perform biopsies. CONCLUSION: Our novel model combining LinkNet and scSE shows promising results for the kidney biopsy application, implying potential for other similar ultrasound-guided biopsies that require needle tracking and trajectory prediction.
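The trajectory-prediction step — extrapolating from the smallest bounding box that captures the needle contour — can be sketched as follows. Extending along the box diagonal is an assumed simplification for illustration, not necessarily the exact rule used in the paper:

```python
# Smallest axis-aligned bounding box of a set of (x, y) contour points.
def bounding_box(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

# Extend the box diagonal beyond its far corner by `step` pixels,
# giving a predicted point ahead of the needle tip along its heading.
def extrapolate(points, step=5):
    x0, y0, x1, y1 = bounding_box(points)
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    return x1 + step * dx / norm, y1 + step * dy / norm

needle = [(10, 12), (13, 15), (16, 18), (19, 21)]  # hypothetical contour pixels
print(extrapolate(needle))
```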


Subjects
Image-Guided Biopsy/methods; Kidney/pathology; Ultrasonography, Interventional/methods; Algorithms; Biopsy, Needle/methods; Humans; Kidney/diagnostic imaging; Models, Anatomic; Robotics