Results 1 - 3 of 3
1.
Bioengineering (Basel); 9(12), 2022 Nov 29.
Article in English | MEDLINE | ID: mdl-36550943

ABSTRACT

In this study, we propose a deep learning framework and a self-supervision scheme for video-based surgical gesture recognition. The proposed framework is modular. First, a 3D convolutional network extracts feature vectors from video clips, encoding spatial and short-term temporal features. Second, the feature vectors are fed into a transformer network that captures long-term temporal dependencies. Two main models are built on this backbone: C3DTrans (supervised) and SSC3DTrans (self-supervised). The dataset consisted of 80 videos from two basic laparoscopic tasks: peg transfer (PT) and knot tying (KT). To examine the potential of self-supervision, the models were trained on 60% and 100% of the annotated dataset. In addition, the best-performing model was evaluated on the JIGSAWS robotic surgery dataset. The best model (C3DTrans) achieved accuracies of 88.0% and 95.2% (clip level) and 97.5% and 97.9% (gesture level) for PT and KT, respectively. SSC3DTrans performed similarly to C3DTrans when trained on 60% of the annotated dataset (about 84% and 93% clip-level accuracy for PT and KT, respectively). On JIGSAWS, C3DTrans reached close to 76% accuracy, similar to or higher than prior techniques that use a single video stream, no additional training videos, and online processing.
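As a rough illustration of the two-stage backbone described above (a hypothetical sketch, not the authors' published code), the following PyTorch snippet pairs a small 3D CNN clip encoder with a transformer encoder over the resulting clip-feature sequence; all layer sizes, names, and the gesture count are assumptions made for the example.

import torch
import torch.nn as nn

class C3DTransSketch(nn.Module):
    # Stage 1: a 3D CNN encodes each clip (spatial + short-term temporal).
    # Stage 2: a transformer encoder models long-term dependencies across clips.
    def __init__(self, num_gestures: int, feat_dim: int = 256):
        super().__init__()
        self.cnn3d = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),          # one 64-d vector per clip
        )
        self.proj = nn.Linear(64, feat_dim)
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=4,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(feat_dim, num_gestures)  # per-clip gesture logits

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, num_clips, channels, frames, height, width)
        b, n = clips.shape[:2]
        x = self.cnn3d(clips.flatten(0, 1)).flatten(1)  # (B*N, 64)
        x = self.proj(x).view(b, n, -1)                 # (B, N, feat_dim)
        x = self.transformer(x)                         # long-term context
        return self.head(x)

# Example: 2 videos, 8 clips each, 16 frames of 64x64 RGB per clip.
logits = C3DTransSketch(num_gestures=8)(torch.randn(2, 8, 3, 16, 64, 64))
print(logits.shape)  # torch.Size([2, 8, 8])

In the self-supervised variant (SSC3DTrans), the same backbone would presumably be pretrained on a pretext task before fine-tuning on the labeled gestures; the abstract does not specify the pretext task, so none is shown here.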

2.
Int J Med Robot; 18(6): e2445, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35942601

ABSTRACT

BACKGROUND: We present an artificial intelligence framework for vascularity classification of the gallbladder (GB) wall from intraoperative images of laparoscopic cholecystectomy (LC). METHODS: A two-stage Multiple Instance Convolutional Neural Network is proposed. First, a convolutional autoencoder is trained to extract feature representations from 4585 patches of GB images. The second stage includes a multi-instance encoder that fetches random patches from a GB region and outputs an equal number of embeddings, which feed a multi-input classification module employing pooling and self-attention mechanisms to perform the prediction. RESULTS: The evaluation was performed on 234 GB images of low and high vascularity from 68 LC videos. A thorough comparison with various state-of-the-art multi-instance and single-instance learning algorithms was performed on two experimental tasks: image-level and video-level classification. The proposed framework shows the best performance, with an accuracy of 92.6%-93.2% and an F1 score of 93.5%-93.9%, close to the agreement between the two expert evaluators (94%). CONCLUSIONS: The proposed technique provides a novel approach to classifying LC operations with respect to the vascular pattern of the GB wall.
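The multiple-instance aggregation described in METHODS can be sketched roughly as follows (a hypothetical PyTorch reimplementation; the patch encoder, embedding size, and attention-pooling form are illustrative assumptions, not the paper's exact design):

import torch
import torch.nn as nn

class MILAttentionClassifier(nn.Module):
    # Stage 2 of the pipeline: embed a bag of patches from one GB region,
    # pool them with learned attention weights, and classify the bag.
    def __init__(self, patch_encoder: nn.Module, emb_dim: int = 128):
        super().__init__()
        self.encoder = patch_encoder  # stage 1: encoder half of the autoencoder
        self.attn = nn.Sequential(    # one attention score per patch
            nn.Linear(emb_dim, 64), nn.Tanh(), nn.Linear(64, 1))
        self.classifier = nn.Linear(emb_dim, 2)  # low vs. high vascularity

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (num_patches, C, H, W), randomly sampled from one GB region
        emb = self.encoder(patches)                     # (N, emb_dim)
        weights = torch.softmax(self.attn(emb), dim=0)  # (N, 1), sums to 1
        bag = (weights * emb).sum(dim=0)                # attention pooling
        return self.classifier(bag)                     # image-level logits

# Toy stand-in for the trained autoencoder's encoder half.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
model = MILAttentionClassifier(encoder)
print(model(torch.randn(20, 3, 32, 32)).shape)  # torch.Size([2])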


Subjects
Artificial Intelligence; Laparoscopy; Humans; Gallbladder; Neural Networks, Computer; Algorithms
3.
JSLS; 24(4), 2020.
Article in English | MEDLINE | ID: mdl-33144823

ABSTRACT

BACKGROUND AND OBJECTIVES: Current approaches to surgical skills assessment employ virtual reality simulators, motion sensors, and task-specific checklists. Although accurate, these methods can make the resulting performance measures complex to interpret. The aim of this study is to propose an alternative methodology for skills assessment and classification based on video annotation of laparoscopic tasks. METHODS: Two groups of 32 trainees (students and residents) performed two laparoscopic tasks: peg transfer (PT) and knot tying (KT). Each task was annotated with video analysis software using a vocabulary of eight surgical gestures (surgemes) that denote the elementary gestures required to perform a task. The extracted metrics included the duration and count of each surgeme, penalty events, and counts of sequential surgemes (transitions). Our analysis focused on comparing and classifying trainees' skill levels using a nearest-neighbor approach. Classification was assessed via accuracy, sensitivity, and specificity. RESULTS: For PT, almost all metrics showed a significant performance difference between the two groups (p < 0.001). Residents were able to complete the task with fewer, shorter surgemes and fewer penalty events. Moreover, residents performed significantly fewer transitions (p < 0.05). For KT, residents performed two surgemes in significantly shorter time (p < 0.05). The metrics derived from the video annotations were also able to recognize the trainees' skill level with 0.71-0.86 accuracy, 0.80-1.00 sensitivity, and 0.60-0.80 specificity. CONCLUSION: The proposed technique provides a tool for skills assessment and experience classification of surgical trainees, as well as an intuitive way of describing what surgemes are performed and how.
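The nearest-neighbor classification step could look roughly like the scikit-learn sketch below; the feature layout and every metric value are invented for the example (the study's actual per-trainee metrics are not reproduced here):

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Rows: trainees; columns: task duration (s), surgeme count,
# penalty events, transition count. Labels: 0 = student, 1 = resident.
X = np.array([[310.0, 42, 5, 18],
              [280.5, 39, 4, 16],
              [295.0, 41, 6, 17],
              [150.2, 25, 1, 9],
              [142.8, 23, 0, 8],
              [160.4, 27, 1, 10]])
y = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3)
print(cross_val_score(knn, X, y, cv=3).mean())  # per-fold accuracy, averaged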


Subjects
Clinical Competence; Education, Medical, Graduate/methods; General Surgery/education; Laparoscopy/education; Video Recording; Adult; Female; Humans; Male; Task Performance and Analysis; Young Adult