A Comparative Review on Enhancing Visual Simultaneous Localization and Mapping with Deep Semantic Segmentation.

Liu, Xiwen; He, Yong; Li, Jue; Yan, Rui; Li, Xiaoyu; Huang, Hui

Affiliation

Liu X; Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen 518034, China.
He Y; School of Smart City, Chongqing Jiaotong University, Chongqing 400074, China.
Li J; School of Smart City, Chongqing Jiaotong University, Chongqing 400074, China.
Yan R; College of Traffic & Transportation, Chongqing Jiaotong University, Chongqing 400074, China.
Li X; School of Smart City, Chongqing Jiaotong University, Chongqing 400074, China.
Huang H; School of Smart City, Chongqing Jiaotong University, Chongqing 400074, China.

Sensors (Basel); 24(11), 2024 May 24.

Article in English | MEDLINE | ID: mdl-38894177

ABSTRACT

Visual simultaneous localization and mapping (VSLAM) enhances the navigation of autonomous agents in unfamiliar environments by progressively constructing maps and estimating poses. However, conventional VSLAM pipelines often perform poorly in dynamic environments that contain moving objects. Recent research in deep learning has led to notable progress in semantic segmentation, which assigns a semantic label to each image pixel. Integrating semantic segmentation into VSLAM makes it possible to differentiate between static and dynamic elements in complex scenes. This paper provides a comprehensive comparative review of how semantic segmentation is used to improve the major components of VSLAM, including visual odometry, loop closure detection, and environmental mapping. It introduces the key principles and methods of both traditional VSLAM and deep semantic segmentation, then gives an overview and comparative analysis of the technical implementations of semantic integration across the modules of the VSLAM pipeline. It also examines the features and potential use cases of fusing VSLAM with semantics. The review finds that existing VSLAM models continue to face challenges related to computational complexity. Promising future research directions are identified, including efficient model design, multimodal fusion, online adaptation, dynamic scene reconstruction, and end-to-end joint optimization. This review sheds light on the emerging paradigm of semantic VSLAM and on how deep learning-enabled semantic reasoning could unlock new capabilities for autonomous intelligent systems to operate reliably in the real world.
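
As a rough illustration of how a per-pixel semantic label map can separate static from dynamic scene elements before pose estimation, the sketch below discards feature keypoints that fall on pixels labeled as a dynamic class. It is not taken from the reviewed paper: the function name `filter_static_keypoints`, the class ids in `DYNAMIC_CLASS_IDS`, and the array layouts are all assumptions made for this example.

```python
import numpy as np

# Hypothetical class ids treated as dynamic (assumption; depends on the label map
# of whichever segmentation network is used).
DYNAMIC_CLASS_IDS = (11, 12)


def filter_static_keypoints(keypoints, label_map, dynamic_ids=DYNAMIC_CLASS_IDS):
    """Keep only keypoints whose pixel is not labeled as a dynamic class.

    keypoints: (N, 2) array-like of (x, y) pixel coordinates.
    label_map: (H, W) integer array of per-pixel class ids.
    """
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    height, width = label_map.shape
    # Round to the nearest pixel and clamp to the image bounds.
    cols = np.clip(np.rint(keypoints[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.rint(keypoints[:, 1]).astype(int), 0, height - 1)
    # A keypoint is dynamic if the class id under it is in the dynamic set.
    is_dynamic = np.isin(label_map[rows, cols], dynamic_ids)
    return keypoints[~is_dynamic]


# Toy example: a 4x6 label map with a small "dynamic" region.
label_map = np.zeros((4, 6), dtype=int)
label_map[1:3, 2:4] = 11
keypoints = [[0, 0], [2, 1], [5, 3], [3, 2]]
static_keypoints = filter_static_keypoints(keypoints, label_map)
```

In this toy case only the keypoints at (0, 0) and (5, 3) remain, since the other two lie inside the region labeled with class id 11. A full pipeline would pass the surviving keypoints on to feature matching and pose estimation.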

Keywords

comparative review; deep learning; dynamic environments; semantic segmentation; visual simultaneous localization and mapping

Full text: 1 | Collections: 01-international | Database: MEDLINE | Language: English | Journal: Sensors (Basel) | Publication year: 2024 | Document type: Article | Affiliation country: China
