A Light Multi-View Stereo Method with Patch-Uncertainty Awareness.

Liu, Zhen; Wu, Guangzheng; Xie, Tao; Li, Shilong; Wu, Chao; Zhang, Zhiming; Zhou, Jiali

Affiliation

Liu Z; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Wu G; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Xie T; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Li S; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Wu C; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.
Zhang Z; Rept Battero, Wenzhou 325058, China.
Zhou J; College of Science, Zhejiang University of Technology, Hangzhou 310023, China.

Sensors (Basel); 24(4), 2024 Feb 17.

Article in English | MEDLINE | ID: mdl-38400452

ABSTRACT

Multi-view stereo methods utilize image sequences from different views to generate a 3D point cloud model of the scene. However, existing approaches often overlook coarse-stage features, impacting the final reconstruction accuracy. Moreover, using a fixed range for all the pixels during inverse depth sampling can adversely affect depth estimation. To address these challenges, we present a novel learning-based multi-view stereo method incorporating attention mechanisms and an adaptive depth sampling strategy. Firstly, we propose a lightweight, coarse-feature-enhanced feature pyramid network in the feature extraction stage, augmented by a coarse-feature-enhanced module. This module integrates features with channel and spatial attention, enriching the contextual features that are crucial for the initial depth estimation. Secondly, we introduce a novel patch-uncertainty-based depth sampling strategy for depth refinement, dynamically configuring depth sampling ranges within the GRU-based optimization process. Furthermore, we incorporate an edge detection operator to extract edge features from the reference image's feature map. These edge features are additionally integrated into the iterative cost volume construction, enhancing the reconstruction accuracy. Lastly, our method is rigorously evaluated on the DTU and Tanks and Temples benchmark datasets, revealing its low GPU memory consumption and competitive reconstruction quality compared to other learning-based MVS methods.
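
The contrast between a fixed inverse depth sampling range and a per-pixel adaptive one can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the patch-averaging of uncertainty, and the linear rule that maps uncertainty to range width (`min_frac`, `max_frac`) are all assumptions made for illustration, since the abstract does not specify them.

```python
def inverse_depth_samples(d_min, d_max, n):
    """Return n depths, near to far, uniformly spaced in inverse depth."""
    inv_near, inv_far = 1.0 / d_min, 1.0 / d_max
    return [1.0 / (inv_near + (inv_far - inv_near) * k / (n - 1))
            for k in range(n)]


def patch_uncertainty(unc_map, row, col, radius=1):
    """Mean uncertainty over a square patch, clipped at the image border.

    Assumption: uncertainty values lie in [0, 1]; how they are obtained
    is left open here.
    """
    h, w = len(unc_map), len(unc_map[0])
    vals = [unc_map[r][c]
            for r in range(max(0, row - radius), min(h, row + radius + 1))
            for c in range(max(0, col - radius), min(w, col + radius + 1))]
    return sum(vals) / len(vals)


def adaptive_samples(depth, unc, d_min, d_max, n=4,
                     min_frac=0.02, max_frac=0.2):
    """Sample n depth hypotheses around the current estimate.

    Hypothetical rule: the half-width of the inverse depth interval grows
    linearly with uncertainty, between min_frac and max_frac of the full
    scene range, and is clamped to [d_min, d_max].
    """
    full = 1.0 / d_min - 1.0 / d_max
    half = full * (min_frac + (max_frac - min_frac) * unc)
    inv_hi = min(1.0 / d_min, 1.0 / depth + half)  # nearest hypothesis
    inv_lo = max(1.0 / d_max, 1.0 / depth - half)  # farthest hypothesis
    return inverse_depth_samples(1.0 / inv_hi, 1.0 / inv_lo, n)
```

In this sketch, a pixel whose patch is confident searches a narrow interval around its current depth, while an uncertain one searches a wider interval, which is the general behavior that a fixed range for all pixels cannot provide.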

Keywords

attention mechanism; cost volume; depth learning; multi-view stereo

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Publication year: 2024 Document type: Article