
Understanding Pixel-Level 2D Image Semantics With 3D Keypoint Knowledge Engine.

You, Yang; Li, Chengkun; Lou, Yujing; Cheng, Zhoujun; Li, Liangwei; Ma, Lizhuang; Wang, Weiming; Lu, Cewu.

IEEE Trans Pattern Anal Mach Intell ; 44(9): 5780-5795, 2022 Sep.

Article in English | MEDLINE | ID: mdl-33848241

ABSTRACT

Pixel-level 2D object semantic understanding is an important topic in computer vision and could help machines deeply understand objects (e.g., functionality and affordance) in daily life. However, most previous methods train directly on correspondences in 2D images, which is end-to-end but loses much of the information available in 3D space. In this paper, we propose a new method that predicts image-corresponding semantics in the 3D domain and then projects them back onto 2D images to achieve pixel-level understanding. To obtain reliable 3D semantic labels, which are absent from current image datasets, we build a large-scale keypoint knowledge engine called KeypointNet, which contains 103,450 keypoints and 8,234 3D models from 16 object categories. Our method leverages the advantages of 3D vision and can explicitly reason about object self-occlusion and visibility. We show that our method gives comparable and even superior results on standard semantic benchmarks.
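
The projection step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard pinhole camera, and the function name `project_keypoints` and its parameters are hypothetical. It only discards keypoints behind the camera; reasoning about self-occlusion would additionally need a depth test against the object surface, which is omitted here.

```python
import numpy as np

def project_keypoints(points_3d, labels, K, R, t):
    """Project labelled 3D keypoints into an image (hypothetical helper).

    points_3d: (N, 3) keypoints in object coordinates.
    labels:    length-N sequence of semantic labels.
    K:         (3, 3) pinhole intrinsics (assumed camera model).
    R, t:      object-to-camera rotation (3, 3) and translation (3,).
    Returns a list of (label, u, v, depth) for keypoints in front of the camera.
    """
    cam = points_3d @ R.T + t              # object frame -> camera frame
    projected = []
    for label, p in zip(labels, cam):
        if p[2] <= 0:                      # behind the camera, so not visible
            continue
        uvw = K @ p                        # homogeneous pixel coordinates
        projected.append((label, uvw[0] / uvw[2], uvw[1] / uvw[2], p[2]))
    return projected

# Illustrative values only: identity rotation, camera 2 units from the object.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, -3.0]])     # last point ends up behind the camera
result = project_keypoints(points, ["center", "side", "hidden"], K, R, t)
```

Each surviving tuple carries its depth, so a later visibility test could compare it against a rendered depth map of the model.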

PRIN/SPRIN: On Extracting Point-Wise Rotation Invariant Features.

You, Yang; Lou, Yujing; Shi, Ruoxi; Liu, Qi; Tai, Yu-Wing; Ma, Lizhuang; Wang, Weiming; Lu, Cewu.

IEEE Trans Pattern Anal Mach Intell ; 44(12): 9489-9502, 2022 Dec.

Article in English | MEDLINE | ID: mdl-34822324

ABSTRACT

Point cloud analysis without pose priors is very challenging in real applications, as the orientations of point clouds are often unknown. In this paper, we propose a new point-set learning framework, PRIN (Point-wise Rotation Invariant Network), which focuses on rotation-invariant feature extraction in point cloud analysis. We construct spherical signals by Density Aware Adaptive Sampling to deal with distorted point distributions in spherical space. Spherical Voxel Convolution and Point Re-sampling are proposed to extract rotation-invariant features for each point. In addition, we extend PRIN to a sparse version called SPRIN, which operates directly on sparse point clouds. Both PRIN and SPRIN can be applied to tasks ranging from object classification and part segmentation to 3D feature matching and label alignment. Results show that, on a dataset with randomly rotated point clouds, SPRIN outperforms state-of-the-art methods without any data augmentation. We also provide a thorough theoretical proof and analysis of the point-wise rotation invariance achieved by our methods. The code to reproduce our results will be made publicly available.
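
To make the notion of point-wise rotation invariance in the abstract above concrete, here is a small sketch. It does not implement PRIN or SPRIN: it swaps in a deliberately simple hand-crafted feature (each point's distance to the cloud centroid) and checks numerically that it is unchanged by a random rotation. The function names are hypothetical.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix via QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:       # flip one axis so that det(q) = +1
        q[:, 0] = -q[:, 0]
    return q

def centroid_distance_feature(points):
    """One scalar per point: Euclidean distance to the cloud centroid.

    Rotating the cloud rotates the centroid with it, so these distances
    do not change. This is a toy stand-in for a learned invariant feature.
    """
    return np.linalg.norm(points - points.mean(axis=0), axis=1)

rng = np.random.default_rng(0)
cloud = rng.normal(size=(50, 3))           # synthetic point cloud
rotation = random_rotation(rng)
rotated_cloud = cloud @ rotation.T         # apply the rotation to every point

features_before = centroid_distance_feature(cloud)
features_after = centroid_distance_feature(rotated_cloud)
is_invariant = bool(np.allclose(features_before, features_after))
```

A feature this simple discards most geometric detail; it only illustrates the property that a per-point descriptor can be required to satisfy.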
