Results 1 - 4 of 4

1.
Surg Endosc; 35(11): 6150-6157, 2021 Nov.
Article in English | MEDLINE | ID: mdl-33237461

ABSTRACT

BACKGROUND: Operating room planning is a complex task, as pre-operative estimates of procedure duration have limited accuracy due to large variations in the course of procedures. Information about the progress of ongoing procedures is therefore essential to adapt the daily operating room schedule accordingly. This information should ideally be objective, automatically retrievable, and available in real time. Recordings made during endoscopic surgeries are a potential source of progress information: a trained observer can recognize the ongoing surgical phase from watching these videos. The introduction of deep learning techniques has opened up opportunities to automatically retrieve information from surgical videos. The aim of this study was to apply state-of-the-art deep learning techniques to a new set of endoscopic videos to automatically recognize the progress of a procedure, and to assess the feasibility of the approach in terms of performance, scalability, and practical considerations.

METHODS: A dataset of 33 laparoscopic cholecystectomies (LC) and 35 total laparoscopic hysterectomies (TLH) was used. The surgical tools in use and the ongoing surgical phases were annotated in the recordings. Neural networks were trained on a subset of the annotated videos, and the automatic recognition of surgical tools and phases was then assessed on another subset. The scalability of the networks was tested and practical considerations were recorded.

RESULTS: Surgical tool and phase recognition reached an average precision and recall between 0.77 and 0.89. The scalability tests showed diverging results. Legal considerations had to be taken into account, and a considerable amount of time was needed to annotate the datasets.

CONCLUSION: This study shows the potential of deep learning to automatically recognize information contained in surgical videos, and provides insights into the applicability of such a technique to support operating room planning.
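To make the approach concrete, below is a minimal sketch of a frame-level surgical phase classifier of the kind evaluated in this study. It is an illustration only, not the authors' implementation: the backbone choice, the phase count, and the training loop are assumptions.

# Minimal sketch of frame-level surgical phase recognition (illustrative,
# not the study's code). Assumes frames extracted from the videos, each
# annotated with one phase label.
import torch
import torch.nn as nn
from torchvision import models

NUM_PHASES = 10  # placeholder; the actual LC/TLH phase counts differ

# Fine-tune an ImageNet-pretrained backbone by replacing its classifier head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_PHASES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(frames, phase_labels):
    # One optimization step on a batch of (B, 3, H, W) frames.
    optimizer.zero_grad()
    logits = model(frames)                  # (B, NUM_PHASES)
    loss = criterion(logits, phase_labels)  # phase_labels: (B,) class indices
    loss.backward()
    optimizer.step()
    return loss.item()

In practice, predictions from such a per-frame model are usually smoothed over time, since surgical phases are contiguous segments of the procedure.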


Subjects
Cholecystectomy, Laparoscopic; Deep Learning; Laparoscopy; Humans; Neural Networks, Computer
2.
IEEE Trans Med Imaging; 36(1): 86-97, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27455522

ABSTRACT

Surgical workflow recognition has numerous potential medical applications, such as the automatic indexing of surgical video databases and the optimization of real-time operating room scheduling. As a result, surgical phase recognition has been studied in the context of several kinds of surgeries, such as cataract, neurological, and laparoscopic surgeries. In the literature, two types of features are typically used to perform this task: visual features and tool usage signals. However, the visual features used are mostly handcrafted, and the tool usage signals are usually collected via a manual annotation process or by using additional equipment. In this paper, we propose a novel method for phase recognition that uses a convolutional neural network (CNN) to automatically learn features from cholecystectomy videos and that relies solely on visual information. Previous studies have shown that tool usage signals can provide valuable information for the phase recognition task. We therefore present a novel CNN architecture, called EndoNet, designed to carry out the phase recognition and tool presence detection tasks in a multi-task manner. To the best of our knowledge, this is the first work proposing to use a CNN for multiple recognition tasks on laparoscopic videos. Experimental comparisons to other methods show that EndoNet yields state-of-the-art results for both tasks.
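A simplified sketch of the multi-task idea follows: a shared backbone feeds both a multi-label tool-presence head and a phase head, with the tool predictions also fed into the phase head, since tool usage is informative for phase recognition. This is an illustration only, not the exact EndoNet architecture; the backbone and the class counts are assumptions.

# Simplified multi-task CNN sketch (illustrative; not the exact EndoNet design).
import torch
import torch.nn as nn
from torchvision import models

NUM_PHASES = 7  # assumption: phase count for cholecystectomy videos
NUM_TOOLS = 7   # assumption: tool count for cholecystectomy videos

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()  # expose the shared visual features
        self.backbone = backbone
        self.tool_head = nn.Linear(feat_dim, NUM_TOOLS)  # multi-label head
        # The phase head also sees the tool logits, mirroring the idea that
        # tool usage signals help phase recognition.
        self.phase_head = nn.Linear(feat_dim + NUM_TOOLS, NUM_PHASES)

    def forward(self, x):
        feats = self.backbone(x)
        tool_logits = self.tool_head(feats)
        phase_logits = self.phase_head(torch.cat([feats, tool_logits], dim=1))
        return tool_logits, phase_logits

def multitask_loss(tool_logits, phase_logits, tool_labels, phase_labels):
    # Tool presence is multi-label (sigmoid per tool); phase is single-label.
    tool_loss = nn.functional.binary_cross_entropy_with_logits(
        tool_logits, tool_labels.float())
    phase_loss = nn.functional.cross_entropy(phase_logits, phase_labels)
    return tool_loss + phase_loss

Training on the summed loss lets the shared features serve both tasks, which is the core of the multi-task formulation described in the abstract.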


Subjects
Laparoscopy; Algorithms; Databases, Factual; Neural Networks, Computer
4.
Int J Comput Assist Radiol Surg; 10(6): 737-47, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25847670

ABSTRACT

PURPOSE: Context-aware systems for the operating room (OR) can significantly improve surgical workflow through various applications, such as efficient OR scheduling, context-sensitive user interfaces, and automatic transcription of medical procedures. As an essential element of such systems, surgical action recognition is an important research area. In this paper, we tackle the problem of classifying surgical actions from video clips that capture the activities taking place in the OR.

METHODS: We acquire recordings using a multi-view RGBD camera system mounted on the ceiling of a hybrid OR dedicated to X-ray-based procedures, and annotate clips of the recordings with the corresponding actions. To recognize the surgical actions from the video clips, we use a classification pipeline based on the bag-of-words (BoW) approach, with a novel feature encoding method that extends the classical BoW approach: instead of using the typical rigid grid layout to divide the space of feature locations, we learn the layout from the actual 4D spatio-temporal locations of the visual features. This results in a data-driven, non-rigid layout that retains more spatio-temporal information than its rigid counterpart.

RESULTS: We classify multi-view video clips from a new dataset generated from 11 days of recordings of real operations. The dataset comprises 1734 video clips of 15 actions, including generic actions (e.g., moving the patient to the OR bed) and actions specific to the vertebroplasty procedure (e.g., hammering). The experiments show that the proposed non-rigid feature encoding performs better than the rigid encoding: the classifier's accuracy increases by more than 4 percentage points, from 81.08% to 85.53%.

CONCLUSION: Combining intensity and depth information from the RGBD data provides more discriminative power for the surgical action recognition task than using either alone. Furthermore, the proposed non-rigid spatio-temporal feature encoding scheme provides more discriminative histogram representations than its rigid counterpart. To the best of our knowledge, this is also the first work to present action recognition results on multi-view RGBD data recorded in the OR.
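The contrast between the rigid grid and the learned layout can be sketched as follows. This is an illustration under stated assumptions (vocabulary size, region count, and function names are placeholders), not the authors' implementation.

# Illustrative sketch of BoW encoding with a data-driven (non-rigid) layout.
import numpy as np
from sklearn.cluster import KMeans

VOCAB_SIZE = 256  # visual-word vocabulary size (assumption)
N_REGIONS = 8     # number of learned spatio-temporal regions (assumption)

def fit_models(train_descriptors, train_locations):
    # train_descriptors: (N, D) local features;
    # train_locations: (N, 4) spatio-temporal coordinates of those features.
    vocab = KMeans(n_clusters=VOCAB_SIZE).fit(train_descriptors)
    # Non-rigid layout: cluster the 4D feature locations themselves,
    # instead of imposing a fixed grid over space and time.
    layout = KMeans(n_clusters=N_REGIONS).fit(train_locations)
    return vocab, layout

def encode_clip(descriptors, locations, vocab, layout):
    # One BoW histogram per learned region, concatenated into the clip vector.
    words = vocab.predict(descriptors)    # visual-word index per feature
    regions = layout.predict(locations)   # learned region per feature
    hist = np.zeros((N_REGIONS, VOCAB_SIZE))
    for w, r in zip(words, regions):
        hist[r, w] += 1
    hist /= max(hist.sum(), 1.0)          # L1-normalize
    return hist.ravel()                   # feed this to the classifier

Replacing the location clustering with fixed grid-cell indices recovers the rigid baseline, which is exactly the comparison reported in the results.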


Subjects
Computer Systems; Operating Rooms; Surgery, Computer-Assisted; Algorithms; Humans; Video Recording