Results 1 - 8 of 8
1.
IEEE Trans Image Process; 32: 6210-6222, 2023.
Article in English | MEDLINE | ID: mdl-37943638

ABSTRACT

Facial expression editing has attracted increasing attention with the advance of deep neural networks in recent years. However, most existing methods suffer from compromised editing fidelity and limited usability, as they either ignore pose variations (unrealistic editing) or require paired training data (difficult to collect) for pose control. This paper presents POCE, an innovative pose-controllable expression editing network that can generate realistic facial expressions and head poses simultaneously from unpaired training images alone. POCE achieves accessible and realistic pose-controllable expression editing by mapping face images into UV space, where facial expressions and head poses can be disentangled and edited separately. POCE has two novel designs. The first is self-supervised UV completion, which completes UV maps sampled under different head poses that often suffer from self-occlusions and missing facial texture. The second is weakly-supervised UV editing, which generates new facial expressions with minimal modification of facial identity, where the synthesized expression can be controlled either by an expression label or by transplanting it directly from a reference UV map via feature transfer. Extensive experiments show that POCE learns effectively from unpaired face images, and the learned model can generate realistic and high-fidelity facial expressions under various new poses.


Subjects
Face; Neural Networks, Computer; Face/diagnostic imaging; Facial Expression; Humans
2.
IEEE Trans Image Process; 31: 1532-1544, 2022.
Article in English | MEDLINE | ID: mdl-35015641

ABSTRACT

Unsupervised domain adaptation (UDA) for person re-identification is challenging because of the large gap between the source and target domains. A typical self-training method uses pseudo-labels generated by clustering algorithms to iteratively optimize the model on the target domain. A drawback is that noisy pseudo-labels generally hinder learning. To address this problem, mutual learning between dual networks has been developed to produce reliable soft labels. However, as the two neural networks gradually converge, their complementarity weakens and they tend to become biased towards the same kind of noise. This paper proposes a novel lightweight module, the Attentive WaveBlock (AWB), which can be integrated into the dual networks of mutual learning to enhance their complementarity and further suppress noise in the pseudo-labels. Specifically, we first introduce a parameter-free module, the WaveBlock, which creates a difference between the features learned by the two networks by waving blocks of their feature maps differently. An attention mechanism is then leveraged to enlarge this difference and discover more complementary features. Furthermore, two combination strategies, pre-attention and post-attention, are explored. Experiments demonstrate that the proposed method achieves state-of-the-art performance with significant improvements on multiple UDA person re-identification tasks. We also demonstrate the generality of the proposed method by applying it to vehicle re-identification and image classification tasks. Our code and models are available at: AWB.
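
The abstract describes the WaveBlock only at a high level. One plausible minimal sketch of a "waving" operation, amplifying a random stripe of the feature map so that two networks with independent randomness see different features, is given below; the stripe shape, block_frac, and wave_rate are our illustrative choices, not the paper's:

```python
import numpy as np

def waveblock(feat, block_frac=0.25, wave_rate=2.0, rng=None):
    """Amplify a random horizontal stripe of a (C, H, W) feature map, leaving
    the rest untouched. Applied with independent randomness in each of the two
    mutual-learning networks, it keeps their learned features different."""
    rng = rng or np.random.default_rng()
    c, h, w = feat.shape
    bh = max(1, int(h * block_frac))           # stripe height
    top = rng.integers(0, h - bh + 1)          # random stripe position
    out = feat.copy()
    out[:, top:top + bh, :] *= wave_rate       # "wave" the selected block
    return out
```

The operation is parameter-free in the sense that nothing is learned; the difference it creates is what the attention mechanism then enlarges.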


Assuntos
Algoritmos , Redes Neurais de Computação , Análise por Conglomerados , Humanos
3.
IEEE Trans Image Process; 30: 4046-4056, 2021.
Article in English | MEDLINE | ID: mdl-33793400

ABSTRACT

Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications. Unfortunately, it has received much less attention than supervised object detection. Models that try to address this task tend to suffer from a shortage of annotated training samples. Moreover, existing methods of feature alignments are not sufficient to learn domain-invariant representations. To address these limitations, we propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training into a unified framework. An intermediate domain image generator is proposed to enhance feature alignments by domain-adversarial training with automatically generated soft domain labels. The synthetic intermediate domain images progressively bridge the domain divergence and augment the annotated source domain training data. A feature pyramid alignment is designed and the corresponding feature discriminator is used to align multi-scale convolutional features of different semantic levels. Last but not least, we introduce a region feature alignment and an instance discriminator to learn domain-invariant features for object proposals. Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations. Further extensive experiments verify the effectiveness of each component and demonstrate that the proposed network can learn domain-invariant representations.
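
The intermediate-domain idea can be sketched as samples that interpolate between source and target, labeled by their mixing coefficient for the domain discriminator. AFAN uses a learned image generator; the pixel-wise blend below is only an illustrative stand-in, and the function name and parameters are ours:

```python
import numpy as np

def intermediate_domain_batch(src_imgs, tgt_imgs, rng=None):
    """Blend paired batches of source and target images (N, C, H, W) into
    intermediate-domain samples with soft domain labels: 1.0 means pure
    source, 0.0 means pure target. A domain discriminator trained against
    these soft labels sees a continuum bridging the two domains."""
    rng = rng or np.random.default_rng()
    alpha = rng.uniform(0.0, 1.0, size=(len(src_imgs), 1, 1, 1))
    mixed = alpha * src_imgs + (1.0 - alpha) * tgt_imgs
    soft_labels = alpha.reshape(-1)
    return mixed, soft_labels
```

In the paper's framework, such samples also augment the annotated source data, since object annotations carry over from the source image in each blend.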

4.
Article in English | MEDLINE | ID: mdl-31535991

ABSTRACT

Though Faster R-CNN based two-stage detectors have achieved significant gains in pedestrian detection accuracy, they are still too slow for practical applications. One solution is to simplify the pipeline into a single-stage detector. However, current single-stage detectors (e.g. SSD) have not achieved competitive accuracy on common pedestrian detection benchmarks. Accordingly, a structurally simple but effective module called Asymptotic Localization Fitting (ALF) is proposed, which stacks a series of predictors to evolve SSD's default anchor boxes step by step toward better detection results. Additionally, combining the advantages of residual learning and multi-scale context encoding, a bottleneck block is proposed to enhance the predictors' discriminative power. On top of these designs, an efficient single-stage detection architecture is constructed, yielding a pedestrian detector that is attractive in both accuracy and speed. Comprehensive experiments on two of the largest pedestrian detection datasets (CityPersons and Caltech) demonstrate the superiority of the proposed method over the state of the art on both benchmarks.
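
The step-by-step anchor evolution can be sketched with standard Faster R-CNN style box regression, each stacked predictor refining the boxes left by the previous one. The function names are ours, and `predictors` stands in for the learned regression heads:

```python
import numpy as np

def apply_deltas(boxes, deltas):
    """Standard box regression: shift centers and rescale sizes of (N, 4)
    boxes [x1, y1, x2, y2] by predicted deltas (dx, dy, dw, dh)."""
    x1, y1, x2, y2 = boxes.T
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = deltas.T
    cx, cy = cx + dx * w, cy + dy * h          # shift centers
    w, h = w * np.exp(dw), h * np.exp(dh)      # rescale sizes
    return np.stack([cx - 0.5 * w, cy - 0.5 * h,
                     cx + 0.5 * w, cy + 0.5 * h], axis=1)

def asymptotic_localization(anchors, predictors):
    """ALF-style refinement: each predictor in the stack regresses deltas
    from the current boxes, so the default anchors evolve step by step."""
    boxes = anchors
    for predict in predictors:
        boxes = apply_deltas(boxes, predict(boxes))
    return boxes
```

Each stage therefore fits against boxes that are already closer to the pedestrians than the raw SSD anchors, which is the "asymptotic" part of the design.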

5.
IEEE Trans Pattern Anal Mach Intell; 38(2): 211-23, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26761729

ABSTRACT

We propose a method to address challenges in unconstrained face detection, such as arbitrary pose variations and occlusions. First, a new image feature called the Normalized Pixel Difference (NPD) is proposed. The NPD feature is computed as the difference-to-sum ratio between two pixel values, inspired by the Weber fraction in experimental psychology. The new feature is scale invariant, bounded, and able to reconstruct the original image. Second, we propose a deep quadratic tree to learn the optimal subset of NPD features and their combinations, so that complex face manifolds can be partitioned by the learned rules. In this way, only a single soft-cascade classifier is needed to handle unconstrained face detection. Furthermore, we show that the NPD features can be obtained efficiently from a lookup table, and that the detection template can be easily scaled, making the proposed face detector very fast. Experimental results on three public face datasets (FDDB, GENKI, and CMU-MIT) show that the proposed method achieves state-of-the-art performance in detecting unconstrained faces with arbitrary pose variations and occlusions in cluttered scenes.
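
The difference-to-sum ratio at the core of the NPD feature can be sketched as follows; the function name and the zero-division convention are ours, not the paper's:

```python
import numpy as np

def npd(x, y):
    """Normalized Pixel Difference between two pixel values: (x - y) / (x + y).
    Bounded in [-1, 1] and scale invariant: npd(c*x, c*y) == npd(x, y) for
    any c > 0. Defined as 0 when both pixels are 0 to avoid dividing by zero."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    s = x + y
    out = np.zeros_like(s)
    nz = s != 0
    out[nz] = (x[nz] - y[nz]) / s[nz]
    return out
```

The scale invariance is what makes the feature robust to global illumination changes, and the boundedness is what makes a fixed-size lookup table possible for 8-bit pixel pairs.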


Assuntos
Identificação Biométrica/métodos , Face/anatomia & histologia , Processamento de Imagem Assistida por Computador/métodos , Software , Algoritmos , Bases de Dados Factuais , Humanos , Aprendizado de Máquina , Curva ROC
6.
IEEE Trans Pattern Anal Mach Intell; 35(5): 1193-205, 2013 May.
Article in English | MEDLINE | ID: mdl-23520259

ABSTRACT

Numerous methods have been developed for holistic face recognition with impressive performance. However, few studies have tackled how to recognize an arbitrary patch of a face image. Partial faces frequently appear in unconstrained scenarios, with images captured by surveillance cameras or handheld devices (e.g., mobile phones) in particular. In this paper, we propose a general partial face recognition approach that does not require face alignment by eye coordinates or any other fiducial points. We develop an alignment-free face representation method based on Multi-Keypoint Descriptors (MKD), where the descriptor size of a face is determined by the actual content of the image. In this way, any probe face image, holistic or partial, can be sparsely represented by a large dictionary of gallery descriptors. A new keypoint descriptor called Gabor Ternary Pattern (GTP) is also developed for robust and discriminative face recognition. Experimental results are reported on four public domain face databases (FRGCv2.0, AR, LFW, and PubFig) under both the open-set identification and verification scenarios. Comparisons with two leading commercial face recognition SDKs (PittPatt and FaceVACS) and two baseline algorithms (PCA+LDA and LBP) show that the proposed method, overall, is superior in recognizing both holistic and partial faces without requiring alignment.
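
The alignment-free matching idea can be sketched as keypoint descriptors voting for identities in a gallery dictionary. The paper represents each probe sparsely over the full gallery dictionary; the nearest-neighbor voting below is a simpler stand-in, and the function name is ours:

```python
import numpy as np

def identify(probe_desc, gallery_desc, gallery_ids):
    """Match a probe face given as a set of keypoint descriptors (M, D)
    against a gallery dictionary (N, D) whose rows are labeled by identity.
    Each probe descriptor votes for the identity of its nearest gallery
    descriptor under cosine similarity; the most-voted identity wins. Since
    the descriptor set grows with image content, partial faces simply cast
    fewer votes rather than failing outright."""
    p = probe_desc / np.linalg.norm(probe_desc, axis=1, keepdims=True)
    g = gallery_desc / np.linalg.norm(gallery_desc, axis=1, keepdims=True)
    nearest = np.argmax(p @ g.T, axis=1)       # best gallery row per probe keypoint
    votes = np.bincount(gallery_ids[nearest])  # accumulate votes per identity
    return int(np.argmax(votes))
```

No eye coordinates or fiducial points enter the computation, which is the sense in which the representation is alignment-free.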


Assuntos
Identificação Biométrica/métodos , Face/anatomia & histologia , Algoritmos , Bases de Dados Factuais , Humanos , Processamento de Imagem Assistida por Computador , Postura
7.
IEEE Trans Image Process; 20(1): 247-56, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20643604

ABSTRACT

Information jointly contained in the image space, scale, and orientation domains can provide rich clues not available in any of these domains individually. The position, spatial frequency, and orientation selectivity properties are believed to play an important role in visual perception. This paper proposes a novel face representation and recognition approach that explores information jointly in the image space, scale, and orientation domains. Specifically, the face image is first decomposed into different scale and orientation responses by convolving it with multiscale and multiorientation Gabor filters. Second, local binary pattern analysis is used to describe the neighboring relationship not only in image space but also across the different scale and orientation responses. In this way, information from different domains is exploited to give a good face representation for recognition. Discriminant classification is then performed based on weighted histogram intersection or conditional mutual information with linear discriminant analysis techniques. Extensive experimental results on the FERET, AR, and FRGC ver 2.0 databases show the significant advantages of the proposed method over existing ones.
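
The local binary pattern step and the histogram-intersection similarity can be sketched as below; here LBP is applied to a single 2-D response map, whereas the paper applies it across scale and orientation responses as well, and the per-bin weights would be learned rather than supplied:

```python
import numpy as np

def lbp8(img):
    """Basic 3x3 local binary pattern: threshold the 8 neighbors of each
    interior pixel at the center value and pack the bits into a code 0..255."""
    img = np.asarray(img, dtype=np.float64)
    c = img[1:-1, 1:-1]
    # neighbor offsets in a fixed clockwise order starting at the top-left
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

def lbp_histogram(img):
    """Normalized histogram of LBP codes: one descriptor per image or region."""
    h = np.bincount(lbp8(img).ravel(), minlength=256).astype(np.float64)
    return h / h.sum()

def weighted_histogram_intersection(h1, h2, w=None):
    """Similarity of two histograms: sum of bin-wise minima, optionally
    weighted per bin. Equal histograms score 1.0 when unweighted."""
    m = np.minimum(h1, h2)
    return float(np.sum(m if w is None else w * m))
```

Running `lbp_histogram` on each Gabor response and concatenating the histograms yields the kind of joint space-scale-orientation descriptor the abstract describes.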


Assuntos
Algoritmos , Identificação Biométrica/métodos , Processamento de Imagem Assistida por Computador/métodos , Inteligência Artificial , Análise Discriminante , Face/anatomia & histologia , Expressão Facial , Humanos , Curva ROC
8.
IEEE Trans Pattern Anal Mach Intell; 29(4): 627-39, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17299220

ABSTRACT

Most current face recognition systems are designed for indoor, cooperative-user applications. However, even in such constrained applications, most existing systems, academic and commercial, are compromised in accuracy by changes in environmental illumination. In this paper, we present a novel solution for illumination-invariant face recognition in indoor, cooperative-user applications. First, we present an active near-infrared (NIR) imaging system that is able to produce good-quality face images regardless of the visible illumination in the environment. Second, we show that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone; based on this, we use local binary pattern (LBP) features to compensate for the monotonic transform, thus deriving an illumination-invariant face representation. We then present methods for face recognition using NIR images; statistical learning algorithms are used to extract the most discriminative features from a large pool of invariant LBP features and to construct a highly accurate face matching engine. Finally, we present a system that achieves accurate and fast face recognition in practice, including a method to deal with specular reflections of the active NIR lights on eyeglasses, a critical issue in active NIR image-based face recognition. Extensive comparative results are provided to evaluate the imaging hardware, the face and eye detection algorithms, and the face recognition algorithms and systems with respect to various factors, including illumination, eyeglasses, time lapse, and ethnic group.
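
The key invariance claim, that LBP codes are unchanged by any monotonic gray-tone transform, can be checked directly: a strictly increasing transform preserves every pixel-order comparison the codes are built from. The sketch below uses a basic 3x3 LBP and an arbitrary gamma-plus-gain transform of our choosing:

```python
import numpy as np

def lbp8(img):
    """3x3 local binary pattern codes for the interior pixels of img: each
    of the 8 neighbors is compared against the center and the results are
    packed into an 8-bit code."""
    img = np.asarray(img, dtype=np.float64)
    c = img[1:-1, 1:-1]
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

rng = np.random.default_rng(0)
nir = rng.integers(0, 256, size=(16, 16)).astype(np.float64)
# Any strictly increasing gray-tone transform, e.g. gamma plus gain,
# leaves every >= comparison, and hence every LBP code, unchanged.
transformed = 2.0 * nir ** 1.5 + 10.0
assert np.array_equal(lbp8(nir), lbp8(transformed))
```

This is why LBP features over NIR images yield a representation that depends only on the face, not on the residual gray-tone variation of the imaging.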


Assuntos
Inteligência Artificial , Biometria/métodos , Face/fisiologia , Reconhecimento Automatizado de Padrão/métodos , Fenômenos Fisiológicos da Pele , Espectrofotometria Infravermelho/métodos , Termografia/métodos , Algoritmos , Simulação por Computador , Humanos , Iluminação/métodos , Modelos Biológicos , Reprodutibilidade dos Testes , Sensibilidade e Especificidade