Search | Portal Regional de la BVS (VHL Regional Portal)

Visual Browse and Exploration in Motion Capture Data with Phylogenetic Tree of Context-Aware Poses.

Chen, Songle; Zhao, Xuejian; Luo, Bingqing; Sun, Zhixin.

Sensors (Basel); 20(18), 2020 Sep 13.

Article in English | MEDLINE | ID: mdl-32933203

ABSTRACT

Visual browsing and exploration of motion capture data treat resource acquisition as a human-computer interaction problem and are an essential approach to target motion search. This paper presents a progressive scheme that starts with pose browsing, then locates the region of interest, and finally switches to online exploration of relevant motions. It addresses three core issues. First, to ease the tension between limited visual space and the ever-increasing size of real-world databases, it applies affinity propagation to a numerical similarity measure of poses to perform data abstraction and obtain the representative poses of clusters. Second, to construct a meaningful neighborhood for user browsing, it further merges logical similarity measures of poses with the weight quartets and casts the isolated representative poses into a phylogenetic tree structure. Third, to support online motion exploration, including motion ranking and clustering, a biLSTM-based auto-encoder is proposed to encode the high-dimensional pose context into a compact latent space. Experimental results on CMU's motion capture data verify the effectiveness of the proposed method.
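The first step, picking representative poses with affinity propagation, can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the authors' implementation: poses are stood in for by 2D points, the similarity is negative squared Euclidean distance, the preference is the median off-diagonal similarity, and the names `affinity_propagation`, `neg_sq_dist`, and `toy_poses` are hypothetical.

```python
import statistics


def neg_sq_dist(p, q):
    """Similarity between two feature vectors: negative squared distance."""
    return -sum((a - b) ** 2 for a, b in zip(p, q))


def affinity_propagation(S, damping=0.5, iterations=200):
    """Plain affinity propagation on a square similarity matrix S.

    S[k][k] holds the preference of item k to become an exemplar.
    Returns (exemplar indices, label per item), where each label is the
    index of the exemplar the item is assigned to.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well k suits i compared with the best rival.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # Availability: accumulated support for k from the other items.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]  # positive support from i' != k
            for i in range(n):
                if i == k:
                    new = support
                else:
                    new = min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    evidence = [R[k][k] + A[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if evidence[k] > 0]
    if not exemplars:  # fallback: keep the single strongest candidate
        exemplars = [max(range(n), key=lambda k: evidence[k])]
    labels = [
        i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
        for i in range(n)
    ]
    return exemplars, labels


# Toy "poses": two well-separated groups of 2D feature vectors (assumed data).
toy_poses = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
n = len(toy_poses)
S = [[neg_sq_dist(toy_poses[i], toy_poses[j]) for j in range(n)] for i in range(n)]
preference = statistics.median(S[i][j] for i in range(n) for j in range(n) if i != j)
for k in range(n):
    S[k][k] = preference

exemplars, labels = affinity_propagation(S)
representative_poses = [toy_poses[k] for k in exemplars]
```

Only the exemplars need to be shown to the user, which is what keeps the display small as the database grows. The preference value controls how many exemplars emerge, so a real system would likely tune it to the available screen space.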

Subject(s)

Algorithms; Phylogeny; Cluster Analysis; Databases, Factual; Humans; Motion

VERAM: View-Enhanced Recurrent Attention Model for 3D Shape Classification.

Chen, Songle; Zheng, Lintao; Zhang, Yan; Sun, Zhixin; Xu, Kai.

IEEE Trans Vis Comput Graph; 25(12): 3244-3257, 2019 Dec.

Article in English | MEDLINE | ID: mdl-30137010

ABSTRACT

Multi-view deep neural networks are perhaps the most successful approach to 3D shape classification. However, fusing multi-view features by max or average pooling lacks a view selection mechanism, which limits its application in, e.g., multi-view active object recognition by a robot. This paper presents VERAM, a view-enhanced recurrent attention model capable of actively selecting a sequence of views for highly accurate 3D shape classification. VERAM addresses an important issue commonly found in existing attention-based models, namely the unbalanced training of the subnetworks for next-view estimation and shape classification. The classification subnetwork overfits easily while the view estimation subnetwork is usually poorly trained, leading to suboptimal classification performance. This is overcome by three view-enhancement strategies: 1) enhancing the information flow of gradient backpropagation for the view estimation subnetwork, 2) devising a highly informative reward function for the reinforcement training of view estimation, and 3) formulating a novel loss function that explicitly circumvents view duplication. Taking grayscale images as input and AlexNet as the CNN architecture, VERAM with 9 views achieves instance-level and class-level accuracy of 95.5 and 95.3 percent on ModelNet10 and 93.7 and 92.1 percent on ModelNet40, both of which are state-of-the-art results for the same number of views.
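The pooling-based fusion that the abstract contrasts VERAM with can be sketched in a few lines. This illustrates the baseline only, not VERAM itself; the function name `fuse_views` and the toy feature values are assumptions made for the example.

```python
def fuse_views(view_features, mode="max"):
    """Fuse per-view feature vectors element-wise by max or mean pooling."""
    if not view_features:
        raise ValueError("at least one view is required")
    columns = list(zip(*view_features))  # group the same feature across views
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


# Three hypothetical views, each described by a 3-dimensional feature vector.
views = [
    [1.0, 4.0, 0.0],
    [3.0, 2.0, 6.0],
    [2.0, 0.0, 3.0],
]

max_fused = fuse_views(views, mode="max")
mean_fused = fuse_views(views, mode="mean")

# Pooling ignores the order in which views arrive.
order_invariant = fuse_views(views[::-1], mode="max") == max_fused
```

Every view is consumed at once and the result does not depend on view order, so nothing in this fusion step decides which view to acquire next. That is the gap a recurrent attention model with next-view estimation is meant to fill.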
