Results 1 - 5 of 5
1.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 13314-13327, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37399164

ABSTRACT

Deep learning architectures, although successful in most computer vision tasks, were designed for data with an underlying Euclidean structure, an assumption that is usually not satisfied because pre-processed data may lie on a non-linear space. In this article, we propose a geometric deep learning approach using rigid and non-rigid transformations, named KShapenet, for 2D and 3D landmark-based human motion analysis. Landmark configuration sequences are first modeled as trajectories on Kendall's shape space and then mapped to a linear tangent space. The resulting structured data are then fed to a deep learning architecture that includes a layer optimizing over rigid and non-rigid transformations of landmark configurations, followed by a CNN-LSTM network. We apply KShapenet to 3D human landmark sequences for action and gait recognition and to 2D facial landmark sequences for expression recognition, and demonstrate that the proposed approach is competitive with the state of the art.
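The first two steps named in the abstract (modeling landmark configurations on Kendall's shape space, then mapping them to a linear tangent space) can be sketched as follows. This is a minimal NumPy illustration of the standard construction, not the KShapenet code: the function names, the choice of the first frame as the reference point, and the use of orthogonal Procrustes alignment are assumptions made for the example.

```python
import numpy as np

def to_preshape(X):
    """Center a (k, m) landmark configuration and scale it to unit Frobenius norm."""
    Xc = X - X.mean(axis=0, keepdims=True)
    return Xc / np.linalg.norm(Xc)

def align_rotation(Z, ref):
    """Rotate pre-shape Z onto ref (orthogonal Procrustes restricted to rotations)."""
    U, _, Vt = np.linalg.svd(Z.T @ ref)
    R = U @ Vt
    if np.linalg.det(R) < 0:
        # Flip the direction of the smallest singular value to exclude reflections.
        U[:, -1] *= -1
        R = U @ Vt
    return Z @ R

def log_map(ref, Z):
    """Tangent vector at ref pointing toward Z on the unit pre-shape sphere."""
    c = np.clip(np.sum(ref * Z), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(ref)
    return theta / np.sin(theta) * (Z - c * ref)

def trajectory_to_tangent(seq):
    """Map a (T, k, m) landmark sequence to tangent vectors at its first frame.

    Using the first frame as the reference is an assumption of this sketch.
    """
    ref = to_preshape(seq[0])
    return np.stack([log_map(ref, align_rotation(to_preshape(X), ref)) for X in seq])
```

Two configurations that differ only by translation, scale, and rotation map to the same point, so their tangent vector is numerically zero. The output of `trajectory_to_tangent` is an ordinary array that a conventional network could consume.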


Subjects
Algorithms , Neural Networks, Computer , Humans
2.
IEEE Trans Pattern Anal Mach Intell ; 42(10): 2594-2607, 2020 Oct.
Article in English | MEDLINE | ID: mdl-31395537

ABSTRACT

The detection and tracking of human landmarks in video streams has become more reliable, partly owing to the availability of affordable RGB-D sensors. The analysis of such time-varying geometric data plays an important role in automatic human behavior understanding. However, suitable shape representations, as well as their temporal evolutions, termed trajectories, often lie on nonlinear manifolds. This nonlinearity places an additional constraint on the use of conventional machine learning techniques. As a solution, this paper adapts the well-known Sparse Coding and Dictionary Learning approach to study time-varying shapes on the Kendall shape spaces of 2D and 3D landmarks. We illustrate effective coding of 3D skeletal sequences for action recognition and of 2D facial landmark sequences for macro- and micro-expression recognition. To overcome the inherent nonlinearity of the shape spaces, intrinsic and extrinsic solutions were explored. The main result is that shape trajectories give rise to more discriminative time series with suitable computational properties, including sparsity and vector space structure. Extensive experiments on commonly used datasets demonstrate that the proposed approaches are competitive with the state of the art.
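For readers unfamiliar with sparse coding, the basic Euclidean step can be sketched as below: given a fixed dictionary `D`, find a sparse coefficient vector `a` minimizing `0.5 * ||x - D a||^2 + lam * ||a||_1`. The abstract does not state which solver or formulation the paper uses, so the ISTA iteration, the function names, and the parameters `lam` and `n_iter` here are assumptions. In an extrinsic reading, `x` would be a shape already mapped to a vector space.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(x, D, lam=0.1, n_iter=200):
    """Approximate argmin_a 0.5 * ||x - D a||^2 + lam * ||a||_1 with ISTA.

    x: (n,) signal; D: (n, p) dictionary whose columns are atoms.
    """
    # Squared spectral norm of D is the Lipschitz constant of the smooth term's gradient.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a
```

With an orthonormal dictionary the solution reduces to soft-thresholding the projections `D.T @ x`, which gives a quick sanity check on the implementation.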

3.
IEEE Trans Pattern Anal Mach Intell ; 38(1): 46-59, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26656577

ABSTRACT

This paper describes a novel framework for computing geodesic paths in shape spaces of spherical surfaces under an elastic Riemannian metric. The novelty lies in defining this Riemannian metric directly on the quotient (shape) space, rather than inheriting it from pre-shape space, and using it to formulate a path energy that measures only the normal components of velocities along the path. In other words, this paper defines and solves for geodesics directly on the shape space and avoids complications resulting from the quotient operation. This comprehensive framework is invariant to arbitrary parameterizations of surfaces along paths, a property termed gauge invariance. Additionally, this paper links the different elastic metrics used in the computer science literature with those used in the mathematical literature, and provides a geometrical interpretation of the terms involved. Examples using real and simulated 3D objects are provided to help illustrate the main ideas.

4.
IEEE Trans Cybern ; 44(12): 2443-57, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25415949

ABSTRACT

In this paper, we present an automatic approach for facial expression recognition from 3-D video sequences. In the proposed solution, the 3-D faces are represented by collections of radial curves, and a Riemannian shape analysis is applied to quantify the deformations induced by the facial expressions in a given subsequence of 3-D frames. This is obtained from a dense scalar field, which denotes the shooting directions of the geodesic paths constructed between pairs of corresponding radial curves of two faces. Because the resulting dense scalar fields are high-dimensional, a Linear Discriminant Analysis (LDA) transformation is applied to the dense feature space. Two methods are then used for classification: 1) 3-D motion extraction with a temporal hidden Markov model (HMM) and 2) mean deformation capturing with a random forest. While a dynamic HMM is trained on the features in the first approach, the second computes mean deformations under a window and applies a multiclass random forest. Both proposed classification schemes on the scalar fields showed comparable results and outperformed earlier studies on facial expression recognition from 3-D video sequences.
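The "mean deformations under a window" step of the second scheme can be sketched as a sliding-window average over per-frame feature vectors. This is only an illustration: the window length, the step, and the function name are assumptions, and the feature extraction (dense scalar fields followed by LDA) and the random forest classifier are not reproduced here.

```python
import numpy as np

def windowed_mean(features, window, step=1):
    """Average per-frame feature vectors over sliding windows.

    features: (T, d) array with one feature vector per frame.
    Returns an array with one mean vector per window position.
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    if window < 1 or window > n_frames:
        raise ValueError("window must be between 1 and the number of frames")
    starts = range(0, n_frames - window + 1, step)
    return np.stack([features[s:s + window].mean(axis=0) for s in starts])
```

Each row of the result summarizes one subsequence of frames and could be passed to any multiclass classifier.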


Subjects
Artificial Intelligence , Biometry/methods , Face/anatomy & histology , Facial Expression , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Algorithms , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Reproducibility of Results , Sensitivity and Specificity , Video Recording/methods
5.
IEEE Trans Pattern Anal Mach Intell ; 35(9): 2270-83, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23868784

ABSTRACT

We propose a novel geometric framework for analyzing 3D faces, with the specific goals of comparing, matching, and averaging their shapes. Here we represent facial surfaces by radial curves emanating from the nose tips and use elastic shape analysis of these curves to develop a Riemannian framework for analyzing shapes of full facial surfaces. This representation, along with the elastic Riemannian metric, seems natural for measuring facial deformations and is robust to challenges such as large facial expressions (especially those with open mouths), large pose variations, missing parts, and partial occlusions due to glasses, hair, and so on. This framework is shown to be promising from both empirical and theoretical perspectives. In terms of the empirical evaluation, our results match or improve upon the state-of-the-art methods on three prominent databases: FRGCv2, GavabDB, and Bosphorus, each posing a different type of challenge. From a theoretical perspective, this framework allows for formal statistical inferences, such as the estimation of missing facial parts using PCA on tangent spaces and computing average shapes.
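The idea of estimating missing parts with PCA on a tangent space can be illustrated with plain vectors. The sketch below assumes the shapes have already been mapped to tangent vectors, fits a PCA basis on complete examples, and then fills the missing coordinates of a new vector by least squares on its observed coordinates. The function names and this particular fitting rule are assumptions for illustration, not the procedure of the paper.

```python
import numpy as np

def fit_pca(V, n_components):
    """Return the sample mean and the top principal directions of the rows of V."""
    mean = V.mean(axis=0)
    _, _, Vt = np.linalg.svd(V - mean, full_matrices=False)
    return mean, Vt[:n_components]

def impute_missing(v, observed, mean, basis):
    """Fill the unobserved entries of v from a PCA model.

    v: (d,) vector whose unobserved entries may hold any value (e.g. NaN).
    observed: (d,) boolean mask, True where v is known.
    """
    # Fit PCA coefficients using only the observed coordinates.
    A = basis[:, observed].T
    coeffs, *_ = np.linalg.lstsq(A, v[observed] - mean[observed], rcond=None)
    recon = mean + coeffs @ basis
    out = v.copy()
    out[~observed] = recon[~observed]
    return out
```

When the data lie exactly in a low-dimensional affine subspace and enough coordinates are observed, the missing entries are recovered exactly; with real data the result is a least-squares estimate.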


Subjects
Biometric Identification/methods , Face/anatomy & histology , Adult , Databases, Factual , Elasticity , Eyeglasses , Facial Expression , Female , Hair/anatomy & histology , Hand/anatomy & histology , Humans , Imaging, Three-Dimensional , Male , Posture