DTS-Net: Depth-to-Space Networks for Fast and Accurate Semantic Object Segmentation.

Ibrahem, Hatem; Salem, Ahmed; Kang, Hyun-Soo

Ibrahem, Hatem; Salem, Ahmed; Kang, Hyun-Soo.

Afiliação

Ibrahem H; Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Korea.
Salem A; Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Korea.
Kang HS; Electrical Engineering Department, Faculty of Engineering, Assiut University, Assiut 71515, Egypt.

Sensors (Basel) ; 22(1)2022 Jan 03.

Article em En | MEDLINE | ID: mdl-35009879

ABSTRACT

ABSTRACT

We propose Depth-to-Space Net (DTS-Net), an effective technique for semantic segmentation using the efficient sub-pixel convolutional neural network. This technique is inspired by depth-to-space (DTS) image reconstruction, which was originally used for image and video super-resolution tasks, combined with a mask enhancement filtration technique based on multi-label classification, namely, Nearest Label Filtration. In the proposed technique, we employ depth-wise separable convolution-based architectures. We propose both a deep network, that is, DTS-Net, and a lightweight network, DTS-Net-Lite, for real-time semantic segmentation; these networks employ Xception and MobileNetV2 architectures as the feature extractors, respectively. In addition, we explore the joint semantic segmentation and depth estimation task and demonstrate that the proposed technique can efficiently perform both tasks simultaneously, outperforming state-of-art (SOTA) methods. We train and evaluate the performance of the proposed method on the PASCAL VOC2012, NYUV2, and CITYSCAPES benchmarks. Hence, we obtain high mean intersection over union (mIOU) and mean pixel accuracy (Pix.acc.) values using simple and lightweight convolutional neural network architectures of the developed networks. Notably, the proposed method outperforms SOTA methods that depend on encoder-decoder architectures, although our implementation and computations are far simpler.

Palavras-chave

convolutional neural networks; real-time computer vision; semantic segmentation

Texto completo

Adicionar na Minha BVS

Imprimir

XML

PubMed Links

Buscar no Google

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Tipo de estudo: Clinical_trials Idioma: En Revista: Sensors (Basel) Ano de publicação: 2022 Tipo de documento: Article

Texto completo

Adicionar na Minha BVS

Imprimir

XML

PubMed Links

Buscar no Google