A Light Multi-View Stereo Method with Patch-Uncertainty Awareness.

Liu, Zhen; Wu, Guangzheng; Xie, Tao; Li, Shilong; Wu, Chao; Zhang, Zhiming; Zhou, Jiali

Affiliation

Liu Z; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Wu G; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Xie T; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Li S; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Wu C; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Zhang Z; Rept Battero, Wenzhou 325058, China.
Zhou J; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.

Sensors (Basel); 24(4), 2024 Feb 17.

Article in English | MEDLINE | ID: mdl-38400452

ABSTRACT

Multi-view stereo methods utilize image sequences from different views to generate a 3D point cloud model of the scene. However, existing approaches often overlook coarse-stage features, impacting the final reconstruction accuracy. Moreover, using a fixed range for all the pixels during inverse depth sampling can adversely affect depth estimation. To address these challenges, we present a novel learning-based multi-view stereo method incorporating attention mechanisms and an adaptive depth sampling strategy. Firstly, we propose a lightweight, coarse-feature-enhanced feature pyramid network in the feature extraction stage, augmented by a coarse-feature-enhanced module. This module integrates features with channel and spatial attention, enriching the contextual features that are crucial for the initial depth estimation. Secondly, we introduce a novel patch-uncertainty-based depth sampling strategy for depth refinement, dynamically configuring depth sampling ranges within the GRU-based optimization process. Furthermore, we incorporate an edge detection operator to extract edge features from the reference image's feature map. These edge features are additionally integrated into the iterative cost volume construction, enhancing the reconstruction accuracy. Lastly, our method is rigorously evaluated on the DTU and Tanks and Temples benchmark datasets, revealing its low GPU memory consumption and competitive reconstruction quality compared to other learning-based MVS methods.
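The abstract does not state how the patch-uncertainty-based sampling range is computed, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes a per-pixel uncertainty map in [0, 1] is already available, averages it over a small patch, and widens or narrows each pixel's inverse-depth sampling interval accordingly. The function names, the linear range rule, and all numeric defaults (`r_min`, `r_max`, `inv_lo`, `inv_hi`) are hypothetical.

```python
import numpy as np

def patch_mean(x, k=3):
    """Mean of x over a k x k window around each pixel (edge-padded)."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    h, w = x.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + h, dx:dx + w]
    return out / (k * k)

def adaptive_inverse_depth_samples(inv_depth, uncertainty, n=4, k=3,
                                   r_min=0.0005, r_max=0.003,
                                   inv_lo=0.001, inv_hi=0.01):
    """Return n inverse-depth hypotheses per pixel, shape (n, H, W).

    Assumed rule (hypothetical): the half-width of the sampling interval
    grows linearly from r_min to r_max with the patch-averaged uncertainty.
    """
    u = np.clip(patch_mean(uncertainty, k), 0.0, 1.0)
    half_range = r_min + (r_max - r_min) * u        # (H, W)
    offsets = np.linspace(-1.0, 1.0, n)             # (n,), symmetric around 0
    samples = inv_depth[None] + offsets[:, None, None] * half_range[None]
    # Keep every hypothesis inside the global inverse-depth bounds.
    return np.clip(samples, inv_lo, inv_hi)
```

Under this assumed rule, pixels in confident patches are sampled in a narrow band around the current estimate, while uncertain patches receive a wider band. In a fixed-range scheme, by contrast, every pixel would share the same `half_range`.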

Keywords

attention mechanism; cost volume; depth learning; multi-view stereo

Full text: 1 | Collection: 01-international | Database: MEDLINE | Language: English | Journal: Sensors (Basel) | Year: 2024 | Document type: Article | Country of affiliation: China
