Classification of endoscopic image and video frames using distance metric-based learning with interpolated latent features.

Sedighipour Chafjiri, Fatemeh; Mohebbian, Mohammad Reza; Wahid, Khan A; Babyn, Paul

Affiliation

Sedighipour Chafjiri F; Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5A9 Canada.
Mohebbian MR; Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5A9 Canada.
Wahid KA; Department of Electrical and Computer Engineering, University of Saskatchewan, Saskatoon, Saskatchewan S7N 5A9 Canada.
Babyn P; Department of Medical Imaging, University of Saskatchewan and Saskatchewan Health Authority, Saskatoon, SK S7K 0M7 Canada.

Multimed Tools Appl ; : 1-22, 2023 Mar 17.

Article in En | MEDLINE | ID: mdl-37362715

ABSTRACT

Conventional Endoscopy (CE) and Wireless Capsule Endoscopy (WCE) are well-known tools for diagnosing disorders of the gastrointestinal (GI) tract. Identifying the anatomical location within the GI tract helps clinicians determine appropriate treatment options, which can reduce the need for repeat endoscopy. Little research addresses classification-based anatomical localization of WCE and CE images, mainly because annotated data are difficult to collect. In this study, we present a few-shot learning method based on distance metric learning that combines transfer learning and manifold mixup to localize and classify endoscopic images and video frames. The proposed method allows us to develop a pipeline for endoscopy video sequence localization that can be trained with only a few samples. The use of manifold mixup improves learning by increasing the number of training epochs while reducing overfitting and providing more accurate decision boundaries. A dataset was collected from 10 different anatomical positions of the human GI tract. Two models were trained using only 78 CE and 27 WCE annotated frames to predict the location of 25,700 and 1,825 video frames from CE and WCE, respectively. We performed a subjective evaluation with nine gastroenterologists to validate the need for such an automated system to localize endoscopic images and video frames. Our method achieved higher accuracy and a higher F1-score than the scores from the subjective evaluation. In addition, the results show improved performance with lower cross-entropy loss when compared with several existing methods trained on the same datasets. This indicates that the proposed method has the potential to be used in endoscopy image classification.

Supplementary Information: The online version contains supplementary material available at 10.1007/s11042-023-14982-1.
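The two ideas named in the abstract, interpolating latent features (manifold mixup) and classifying by distance to a few labelled examples, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes hand-written 2-D vectors in place of embeddings from a trained Siamese network, hypothetical class names, a Beta(2, 2) mixing coefficient, and plain Euclidean distance; all function names are illustrative.

```python
import math
import random


def mixup(feat_a, feat_b, label_a, label_b, lam):
    """Linearly interpolate two latent feature vectors and their one-hot labels."""
    mixed_feat = [lam * a + (1.0 - lam) * b for a, b in zip(feat_a, feat_b)]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_feat, mixed_label


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_support_label(query, support_feats, support_labels):
    """Distance-based prediction: return the label of the closest support vector."""
    distances = [euclidean(query, s) for s in support_feats]
    return support_labels[distances.index(min(distances))]


# Hypothetical latent features for two classes (names are assumptions).
support_feats = [[0.0, 0.0], [1.0, 1.0]]
support_labels = ["esophagus", "stomach"]

# Mixing coefficient drawn from Beta(2, 2); the shape parameter is an assumption.
rng = random.Random(0)
lam = rng.betavariate(2.0, 2.0)

# An interpolated training sample lying between the two support vectors.
mixed_feat, mixed_label = mixup(
    support_feats[0], support_feats[1], [1.0, 0.0], [0.0, 1.0], lam
)

# Classify a query embedding by its nearest labelled support vector.
prediction = nearest_support_label([0.9, 0.8], support_feats, support_labels)
```

In a real pipeline the interpolation would be applied to hidden-layer activations during training, and the support vectors would be embeddings of the few annotated frames.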

Keywords

Classification; Endoscopy; Few-shot learning; GI tract anatomic locations; Manifold mix-up; Siamese neural network

Full text: 1 | Collections: 01-internacional | Database: MEDLINE | Study type: Prognostic_studies | Language: En | Journal: Multimed Tools Appl | Publication year: 2023 | Document type: Article
