Results 1 - 20 of 27
1.
IEEE Trans Neural Netw Learn Syst ; 34(2): 973-986, 2023 Feb.
Article in English | MEDLINE | ID: mdl-34432638

ABSTRACT

Most existing multiview clustering methods operate in the original feature space. However, feature redundancy and noise in the original feature space limit their clustering performance. To address this problem, some multiview clustering methods learn the latent data representation linearly, but their performance may decline when the relation between the latent representation and the original data is nonlinear. Other methods that learn the latent data representation nonlinearly usually conduct latent representation learning and clustering separately, so the learned representation may not be well adapted to clustering. Furthermore, none of them model the intercluster relation and intracluster correlation of data points, which limits the quality of the learned latent data representation and therefore the clustering performance. To solve these problems, this article proposes a novel multiview clustering method via proximity learning in latent representation space, named multiview latent proximity learning (MLPL). On the one hand, MLPL learns the latent data representation in a nonlinear manner while taking the intercluster relation and intracluster correlation into consideration simultaneously. On the other hand, by conducting latent representation learning and consensus proximity learning simultaneously, MLPL learns a consensus proximity matrix with k connected components that yields the clustering result directly. Extensive experiments on seven real-world datasets demonstrate the effectiveness and superiority of the MLPL method compared with state-of-the-art multiview clustering methods.
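The clustering-readout property mentioned above rests on a standard graph fact: a proximity matrix with exactly k connected components yields k clusters directly, since the multiplicity of the Laplacian's zero eigenvalue equals the number of components. The following sketch only illustrates this readout step on an already-learned proximity matrix; it is not the MLPL optimization, and the tolerance is an assumed value.

```python
# Sketch (not the MLPL optimization itself): once a consensus proximity
# matrix S with exactly k connected components has been learned, the
# clustering can be read off directly from the graph it defines.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def clusters_from_proximity(S, tol=1e-8):
    """Return cluster labels from a symmetric nonnegative proximity matrix S."""
    S = np.asarray(S, dtype=float)
    S = (S + S.T) / 2.0                    # enforce symmetry
    A = csr_matrix(S > tol)                # adjacency of the proximity graph
    n_clusters, labels = connected_components(A, directed=False)
    # Sanity check via the graph Laplacian: the multiplicity of the zero
    # eigenvalue equals the number of connected components.
    L = np.diag(S.sum(axis=1)) - S
    zero_eigs = int(np.sum(np.linalg.eigvalsh(L) < tol * S.shape[0]))
    assert zero_eigs == n_clusters
    return labels

# Example: a block-diagonal proximity matrix with two components.
S = np.block([[np.ones((3, 3)), np.zeros((3, 2))],
              [np.zeros((2, 3)), np.ones((2, 2))]])
print(clusters_from_proximity(S))  # e.g. [0 0 0 1 1]
```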

2.
IEEE Trans Pattern Anal Mach Intell ; 45(1): 489-507, 2023 01.
Article in English | MEDLINE | ID: mdl-35130146

ABSTRACT

Egocentric videos, which record the daily activities of individuals from a first-person point of view, have attracted increasing attention in recent years because of their growing use in many popular applications, including life logging, health monitoring and virtual reality. As a fundamental problem in egocentric vision, egocentric action recognition aims to recognize the actions of the camera wearer from egocentric videos. Relation modeling is important here, because the interactions between the camera wearer and the recorded persons or objects form complex relations in egocentric videos. However, only a few existing methods model the relations between the camera wearer and the interacting persons for egocentric action recognition, and they require prior knowledge or auxiliary data to localize the interacting persons. In this work, we model the relations in a weakly supervised manner, i.e., without using annotations or prior knowledge about the interacting persons or objects, for egocentric action recognition. We form a weakly supervised framework that unifies automatic interactor localization and explicit relation modeling. First, we learn to automatically localize the interactors, i.e., the body parts of the camera wearer and the persons or objects that the camera wearer interacts with, by learning a series of keypoints directly from video data to localize the action-relevant regions using only action labels and some constraints on these keypoints. Second, and more importantly, to explicitly model the relations between the interactors, we develop an ego-relational LSTM (long short-term memory) network with several candidate connections to model the complex relations in egocentric videos, such as temporal, interactive, and contextual relations. In particular, to reduce the human effort and manual intervention needed to construct an optimal ego-relational LSTM structure, we search for the optimal connections with a differentiable network architecture search mechanism, which automatically constructs the ego-relational LSTM network to explicitly model different relations for egocentric action recognition. Extensive experiments on egocentric video datasets illustrate the effectiveness of our method.
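The differentiable architecture search over candidate connections can be illustrated, under assumptions, by a DARTS-style mixed operation: each candidate connection gets an architecture weight, a softmax mixes their outputs during search, and the strongest one is kept afterwards. The modules and dimensions below are placeholders, not the ego-relational LSTM from the paper.

```python
# Minimal sketch of a differentiable connection search in the DARTS style.
# Each candidate connection is an nn.Module; a softmax over architecture
# weights mixes their outputs, so the choice of connection is learned by
# gradient descent. Illustration only, not the paper's exact network.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedConnection(nn.Module):
    def __init__(self, candidates):
        super().__init__()
        self.candidates = nn.ModuleList(candidates)
        # One architecture weight per candidate connection.
        self.alpha = nn.Parameter(torch.zeros(len(candidates)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.candidates))

    def chosen(self):
        # After search, keep only the strongest connection.
        return self.candidates[int(self.alpha.argmax())]

# Example: choose among identity, a linear map, and a gated map.
d = 64
mixed = MixedConnection([nn.Identity(), nn.Linear(d, d),
                         nn.Sequential(nn.Linear(d, d), nn.Sigmoid())])
h = torch.randn(8, d)
out = mixed(h)   # differentiable w.r.t. both the inputs and alpha
```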


Subjects
Algorithms , Virtual Reality , Humans , Learning
3.
IEEE Trans Neural Netw Learn Syst ; 34(12): 9671-9684, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35324448

ABSTRACT

Session-based recommendation tries to make use of anonymous session data to deliver high-quality recommendations under the condition that user profiles and the complete historical behavioral data of a target user are unavailable. Previous works consider each session individually and try to capture user interests within a session. Despite their encouraging results, these models can only perceive intra-session items and cannot draw upon the massive historical relational information. To solve this problem, we propose a novel method named global graph guided session-based recommendation (G3SR). G3SR decomposes the session-based recommendation workflow into two steps. First, a global graph is built upon all session data, from which the global item representations are learned in an unsupervised manner. Then, these representations are refined on session graphs under the graph networks, and a readout function is used to generate session representations for each session. Extensive experiments on two real-world benchmark datasets show remarkable and consistent improvements of the G3SR method over the state-of-the-art methods, especially for cold items.
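As a rough sketch of the first step described above, the global graph can be built by counting transitions between consecutive items across all sessions and row-normalizing. The function name and the normalization choice are assumptions for illustration; G3SR's unsupervised representation learning on this graph is not shown.

```python
# Sketch of the global-graph construction step: count transitions between
# consecutive items over all sessions, then row-normalize. Names and the
# normalization are illustrative assumptions.
import numpy as np
from collections import defaultdict

def build_global_graph(sessions, n_items):
    counts = defaultdict(float)
    for s in sessions:
        for a, b in zip(s[:-1], s[1:]):      # consecutive clicks in a session
            counts[(a, b)] += 1.0
    A = np.zeros((n_items, n_items))
    for (a, b), c in counts.items():
        A[a, b] = c
    # Row-normalize so each row is a transition distribution over next items.
    row = A.sum(axis=1, keepdims=True)
    return np.divide(A, row, out=np.zeros_like(A), where=row > 0)

sessions = [[0, 1, 2], [1, 2, 3], [0, 2]]
print(build_global_graph(sessions, n_items=4))
```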

4.
IEEE Trans Cybern ; 53(10): 6636-6648, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37021985

ABSTRACT

Multiparty learning is an indispensable technique for improving learning performance by integrating data from multiple parties. Unfortunately, directly integrating multiparty data cannot meet privacy-preserving requirements, which has driven the development of privacy-preserving machine learning (PPML), a key research task in multiparty learning. Despite this, existing PPML methods generally cannot simultaneously meet multiple requirements, such as security, accuracy, efficiency, and application scope. To deal with these problems, this article presents a new PPML method based on a secure multiparty interactive protocol, namely the multiparty secure broad learning system (MSBLS), and provides its security analysis. Specifically, the proposed method employs the interactive protocol and random mapping to generate the mapped features of the data, and then uses efficient broad learning to train the neural network classifier. To the best of our knowledge, this is the first privacy-preserving computing method that combines secure multiparty computation with neural networks. Theoretically, the method ensures that the accuracy of the model is not reduced by encryption, while the computation remains very fast. Three classical datasets are adopted to verify our conclusions.
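A generic, heavily simplified illustration of why additive secret sharing combines well with a random linear feature mapping is given below; it is not the MSBLS protocol, and the secure handling of the nonlinear step that broad learning adds on top is omitted.

```python
# Generic illustration (not the MSBLS protocol itself): with additive secret
# sharing, a public random *linear* feature mapping can be applied to the
# shares separately and recombined, so neither party reveals its raw data.
# The nonlinear activation used by broad learning would require the
# interactive protocol described in the paper and is omitted here.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))              # the "true" joint data
share_a = rng.normal(size=X.shape)         # party A's random share
share_b = X - share_a                      # party B's share; share_a + share_b == X

W = rng.normal(size=(8, 32))               # public random mapping weights
mapped = share_a @ W + share_b @ W         # computed on shares, then combined
assert np.allclose(mapped, X @ W)          # identical to mapping the raw data
```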

5.
IEEE Trans Cybern ; 52(11): 12231-12244, 2022 Nov.
Article in English | MEDLINE | ID: mdl-33961570

ABSTRACT

The rapid emergence of high-dimensional data in various areas has brought new challenges to current ensemble clustering research. To deal with the curse of dimensionality, considerable recent effort in ensemble clustering has gone into different subspace-based techniques. However, besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimilarity metrics. It remains a surprisingly open problem in ensemble clustering how to create and aggregate a large population of diversified metrics and, furthermore, how to jointly investigate the multilevel diversity of metrics, subspaces, and clusters in a unified framework. To tackle this problem, this article proposes a novel multidiversified ensemble clustering approach. In particular, we create a large number of diversified metrics by randomizing a scaled exponential similarity kernel, which are then coupled with random subspaces to form a large set of metric-subspace pairs. Based on the similarity matrices derived from these metric-subspace pairs, an ensemble of diversified base clusterings can thereby be constructed. Furthermore, an entropy-based criterion is utilized to explore the cluster-wise diversity in ensembles, based on which three specific ensemble clustering algorithms are presented by incorporating three types of consensus functions. Extensive experiments are conducted on 30 high-dimensional datasets, including 18 cancer gene expression datasets and 12 image/speech datasets, which demonstrate the superiority of our algorithms over the state of the art. The source code is available at https://github.com/huangdonghere/MDEC.
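A minimal sketch of the diversification step, under assumptions: pair a random feature subspace with a randomly scaled exponential (Gaussian-like) similarity kernel to obtain one similarity matrix per metric-subspace pair. The exact scaled exponential kernel and bandwidth scheme used by MDEC may differ.

```python
# Sketch of generating diversified similarity matrices by pairing random
# feature subspaces with a randomized exponential similarity kernel. The
# randomized bandwidth here is an illustrative assumption.
import numpy as np
from scipy.spatial.distance import squareform, pdist

def random_similarity(X, subspace_ratio=0.5, sigma_range=(0.5, 2.0), rng=None):
    rng = np.random.default_rng(rng)
    n, d = X.shape
    feats = rng.choice(d, size=max(1, int(subspace_ratio * d)), replace=False)
    D = squareform(pdist(X[:, feats]))          # pairwise distances in the subspace
    sigma = rng.uniform(*sigma_range) * np.median(D[D > 0])
    return np.exp(-(D ** 2) / (2.0 * sigma ** 2))

X = np.random.rand(100, 50)
ensemble = [random_similarity(X, rng=i) for i in range(20)]  # 20 metric-subspace pairs
```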


Subjects
Benchmarking , Neoplasms , Algorithms , Cluster Analysis , Humans , Neoplasms/genetics , Software
6.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6074-6093, 2022 10.
Article in English | MEDLINE | ID: mdl-34048336

ABSTRACT

In conventional person re-identification (re-id), the images used for model training in the training probe set and training gallery set are all assumed to be instance-level samples that are manually labeled from raw surveillance video (likely with the assistance of detection) in a frame-by-frame manner. This labeling across multiple non-overlapping camera views from raw surveillance video is expensive and time consuming. To overcome these issues, we consider weakly supervised person re-id modeling, which aims to find the raw video clips in which a given target person appears. In our weakly supervised setting, given a sample of a person captured in one camera view, our approach aims to train a re-id model without further instance-level labeling for this person in another camera view. The weak setting refers to matching a target person against an untrimmed gallery video where we only know that the identity appears in the video, without annotating the identity in any frame of the video during training. Weakly supervised person re-id is challenging since it not only suffers from the difficulties of conventional person re-id (e.g., visual ambiguity and appearance variations caused by occlusions, pose variations, background clutter, etc.), but, more importantly, is also challenged by the weak supervision, because instance-level labels and ground-truth locations of person instances (i.e., ground-truth bounding boxes) are absent. To solve the weakly supervised person re-id problem, we develop deep graph metric learning (DGML). On the one hand, DGML measures the consistency between intra-video spatial graphs of consecutive frames, where the spatial graph captures the neighborhood relationships among the detected person instances in each frame. On the other hand, DGML distinguishes the inter-video spatial graphs captured from different camera views at different sites simultaneously. To further explicitly embed weak supervision into DGML and solve the weakly supervised person re-id problem, we introduce weakly supervised regularization (WSR), which utilizes multiple weak video-level labels to learn discriminative features by means of a weak identity loss and a cross-video alignment loss. We conduct extensive experiments to demonstrate the feasibility of the weakly supervised person re-id approach and its special cases (e.g., its bag-to-bag extension) and show that the proposed DGML is effective.


Subjects
Biometric Identification , Algorithms , Biometric Identification/methods , Humans
7.
IEEE Trans Cybern ; 52(6): 5229-5241, 2022 Jun.
Article in English | MEDLINE | ID: mdl-33156800

ABSTRACT

In recent years, recommender systems have been widely used in online platforms; they can extract useful information from giant volumes of data and recommend suitable items to users according to their preferences. However, recommender systems usually suffer from sparsity and cold-start problems. Cross-domain recommendation, a particular example of transfer learning, has been used to solve these problems. However, many existing cross-domain recommendation approaches are based on matrix factorization, which can only learn the shallow and linear characteristics of users and items. Therefore, in this article, we propose a novel autoencoder framework with an attention mechanism (AAM) for cross-domain recommendation, which can transfer and fuse information between different domains and make more accurate rating predictions. The main idea of the proposed framework lies in utilizing an autoencoder, a multilayer perceptron, and self-attention to extract user and item features, learn the user and item latent factors, and fuse the user latent factors from different domains, respectively. In addition, to learn the affinity of the user latent factors between different domains at a multiaspect level, we also strengthen the self-attention mechanism by using multihead self-attention and propose AAM++. Experiments conducted on two real-world datasets empirically demonstrate that our proposed methods outperform the state-of-the-art methods in cross-domain recommendation and that AAM++ performs better than AAM on sparse and large-scale datasets.
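A minimal sketch of the AAM++ fusion idea, assuming the user-latent factors from the two domains can be treated as a two-token sequence fed to multihead self-attention; the dimensions, pooling, and output head are illustrative, not the paper's exact architecture.

```python
# Sketch of fusing user-latent factors from two domains with multi-head
# self-attention, in the spirit of AAM++. Illustration only.
import torch
import torch.nn as nn

class CrossDomainFusion(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.out = nn.Linear(dim, dim)

    def forward(self, u_domain_a, u_domain_b):
        # (batch, 2, dim): one token per domain-specific user-latent factor.
        seq = torch.stack([u_domain_a, u_domain_b], dim=1)
        fused, _ = self.attn(seq, seq, seq)
        return self.out(fused.mean(dim=1))   # pooled, fused user representation

fusion = CrossDomainFusion()
u = fusion(torch.randn(32, 64), torch.randn(32, 64))   # shape (32, 64)
```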


Subjects
Learning , Neural Networks, Computer
8.
IEEE Trans Pattern Anal Mach Intell ; 44(7): 3386-3403, 2022 07.
Article in English | MEDLINE | ID: mdl-33571087

ABSTRACT

Despite the remarkable progress achieved in conventional instance segmentation, predicting instance segmentation results for unobserved future frames remains challenging because the future data cannot be observed. Existing methods mainly address this challenge by forecasting the features of future frames. However, these methods always treat features at multiple levels (e.g., coarse-to-fine pyramid features) independently and do not exploit them collaboratively, which results in inaccurate predictions for future frames; moreover, such a weakness can partially hinder the self-adaptation of a future segmentation prediction model to different input samples. To solve this problem, we propose an adaptive aggregation approach called the Auto-Path Aggregation Network (APANet), where the spatio-temporal contextual information contained in the features at each individual level is selectively aggregated using the developed "auto-path". The "auto-path" connects each pair of features extracted at different pyramid levels for task-specific hierarchical contextual information aggregation, which enables selective and adaptive aggregation of pyramid features in accordance with different videos/frames. APANet can further be optimized jointly with the Mask R-CNN head as a feature decoder and a Feature Pyramid Network (FPN) feature encoder, forming a joint learning system for future instance segmentation prediction. We experimentally show that the proposed method achieves state-of-the-art performance on three video-based instance segmentation benchmarks for future instance segmentation prediction.
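The "auto-path" idea of letting the model decide how much information flows between pyramid levels can be sketched, under assumptions, with learnable gates on every ordered pair of levels; the gating form, upsampling, and fusion layer below are illustrative and omit the forecasting and Mask R-CNN components.

```python
# Sketch of gated aggregation across pyramid levels: each ordered pair of
# levels has a learnable gate deciding how much of one level flows into the
# other. Illustration of the mechanism only, not APANet itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedPyramidAggregation(nn.Module):
    def __init__(self, n_levels, channels):
        super().__init__()
        # One gate per ordered pair of pyramid levels.
        self.gates = nn.Parameter(torch.zeros(n_levels, n_levels))
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feats):
        # feats: list of (B, C, H_l, W_l) maps, coarse to fine.
        out = []
        for i, f in enumerate(feats):
            agg = f
            for j, g in enumerate(feats):
                if i == j:
                    continue
                w = torch.sigmoid(self.gates[i, j])
                g = F.interpolate(g, size=f.shape[-2:], mode='bilinear',
                                  align_corners=False)
                agg = agg + w * g
            out.append(self.fuse(agg))
        return out

feats = [torch.randn(2, 16, s, s) for s in (8, 16, 32)]
outs = GatedPyramidAggregation(3, 16)(feats)
```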


Subjects
Image Processing, Computer-Assisted , Neural Networks, Computer , Algorithms , Image Processing, Computer-Assisted/methods , Learning
9.
IEEE Trans Neural Netw Learn Syst ; 33(11): 6726-6736, 2022 11.
Article in English | MEDLINE | ID: mdl-34081589

ABSTRACT

To alleviate the sparsity issue, many recommender systems have been proposed to consider the review text as the auxiliary information to improve the recommendation quality. Despite success, they only use the ratings as the ground truth for error backpropagation. However, the rating information can only indicate the users' overall preference for the items, while the review text contains rich information about the users' preferences and the attributes of the items. In real life, reviews with the same rating may have completely opposite semantic information. If only the ratings are used for error backpropagation, the latent factors of these reviews will tend to be consistent, resulting in the loss of a large amount of review information. In this article, we propose a novel deep model termed deep rating and review neural network (DRRNN) for recommendation. Specifically, compared with the existing models that adopt the review text as the auxiliary information, DRRNN additionally considers both the target rating and target review of the given user-item pair as ground truth for error backpropagation in the training stage. Therefore, we can keep more semantic information of the reviews while making rating predictions. Extensive experiments on four publicly available datasets demonstrate the effectiveness of the proposed DRRNN model in terms of rating prediction.
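A minimal sketch of the dual-supervision idea: the training loss combines a rating term with a review term. The bag-of-words review reconstruction and the weighting factor are assumptions; the paper's exact review loss is not reproduced.

```python
# Sketch of the core training idea in DRRNN: use both the target rating and
# the target review as supervision. The review term here is a bag-of-words
# reconstruction; lambda_review and the decoder shapes are illustrative.
import torch
import torch.nn.functional as F

def drrnn_style_loss(pred_rating, true_rating,
                     pred_review_logits, true_review_bow, lambda_review=0.1):
    rating_loss = F.mse_loss(pred_rating, true_rating)
    # Multi-label bag-of-words reconstruction of the target review.
    review_loss = F.binary_cross_entropy_with_logits(pred_review_logits,
                                                     true_review_bow)
    return rating_loss + lambda_review * review_loss

loss = drrnn_style_loss(torch.randn(16), torch.randn(16),
                        torch.randn(16, 5000),
                        torch.randint(0, 2, (16, 5000)).float())
```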


Subjects
Neural Networks, Computer , Semantics
10.
Article in English | MEDLINE | ID: mdl-35839201

ABSTRACT

As a challenging problem, incomplete multi-view clustering (MVC) has drawn much attention in recent years. Most existing methods inevitably include a feature-recovery step to obtain the clustering result for incomplete multi-view datasets. The extra goal of recovering the missing features in the original data space or a common subspace is difficult for unsupervised clustering tasks and can accumulate errors during optimization. Moreover, the biased error is not taken into consideration by previous graph-based methods. The biased error represents the unexpected change of the incomplete graph structure, such as the increase in intra-class relation density and the missing local graph structure of boundary instances. It misleads graph-based methods and degrades their final performance. To overcome these drawbacks, we propose a new graph-based method named Graph Structure Refining for Incomplete MVC (GSRIMC). GSRIMC avoids the feature-recovery step and instead fully exploits the existing subgraphs of each view to produce superior clustering results. To handle the biased error, biased error separation is the core step of GSRIMC. In detail, GSRIMC first extracts basic information from the precomputed subgraph of each view and then separates the refined graph structure from the biased error with the help of the tensor nuclear norm. Besides, cross-view graph learning is proposed to capture the missing local graph structure and complete the refined graph structure based on the complementary principle. Extensive experiments show that our method achieves better performance than other state-of-the-art baselines.
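The separation of a refined structure from a biased error via a nuclear norm can be illustrated in the matrix case by singular value thresholding, the proximal operator of the nuclear norm. GSRIMC actually applies a tensor nuclear norm across views; the tensor machinery and the threshold value below are not taken from the paper.

```python
# Sketch of the matrix analogue of the separation step: singular value
# thresholding pulls a low-rank "refined" structure out of a noisy graph.
import numpy as np

def singular_value_threshold(G, tau):
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# A noisy subgraph: low-rank block structure plus a sparse biased error.
clean = np.kron(np.eye(3), np.ones((10, 10)))
noisy = clean + 0.5 * (np.random.rand(30, 30) > 0.95)
refined = singular_value_threshold(noisy, tau=2.0)
```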

11.
IEEE Trans Neural Netw Learn Syst ; 32(11): 5047-5060, 2021 Nov.
Article in English | MEDLINE | ID: mdl-33027007

ABSTRACT

Multiview subspace clustering has attracted an increasing amount of attention in recent years. However, most of the existing multiview subspace clustering methods assume linear relations between multiview data points when learning the affinity representation by means of the self-expression or fail to preserve the locality property of the original feature space in the learned affinity representation. To address the above issues, in this article, we propose a new multiview subspace clustering method termed smoothness regularized multiview subspace clustering with kernel learning (SMSCK). To capture the nonlinear relations between multiview data points, the proposed model maps the concatenated multiview observations into a high-dimensional kernel space, in which the linear relations reflect the nonlinear relations between multiview data points in the original space. In addition, to explicitly preserve the locality property of the original feature space in the learned affinity representation, the smoothness regularization is deployed in the subspace learning in the kernel space. Theoretical analysis has been provided to ensure that the optimal solution of the proposed model meets the grouping effect. The unique optimal solution of the proposed model can be obtained by an optimization strategy and the theoretical convergence analysis is also conducted. Extensive experiments are conducted on both image and document data sets, and the comparison results with state-of-the-art methods demonstrate the effectiveness of our method.
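For orientation, a plain kernel self-expression step (without SMSCK's smoothness regularization or kernel learning) has a closed form under a Frobenius-norm penalty, and its symmetrized coefficients can feed spectral clustering. The kernel, bandwidth, and penalty below are assumptions.

```python
# Sketch of plain kernel self-expression: with a Frobenius penalty the
# coefficient matrix has the closed form C = (K + lambda*I)^{-1} K, and the
# affinity is its symmetrized magnitude. SMSCK's smoothness regularization
# and its theoretical guarantees are not reproduced here.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.cluster import spectral_clustering

X = np.random.rand(60, 10)                 # concatenated multiview features (toy)
K = rbf_kernel(X, gamma=0.5)
lam = 0.1
C = np.linalg.solve(K + lam * np.eye(len(K)), K)
affinity = (np.abs(C) + np.abs(C.T)) / 2.0
labels = spectral_clustering(affinity, n_clusters=3)
```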

12.
IEEE Trans Cybern ; 49(7): 2678-2692, 2019 Jul.
Article in English | MEDLINE | ID: mdl-29994495

ABSTRACT

Collaborative filtering (CF) algorithms have been widely used to build recommender systems since they have a distinctive ability to share collective wisdom and experience. However, they may easily fall into the trap of the Matthew effect, which tends to recommend popular items, so that less popular items become increasingly less popular. Under this circumstance, most of the items in the recommendation list are already familiar to users, and performance therefore seriously degenerates in finding cold items, i.e., new items and niche items. To address this issue, in this paper a user survey is first conducted on online shopping habits in China, based on which a novel recommendation algorithm termed innovator-based CF is proposed that can recommend cold items to users by introducing the concept of innovators. Specifically, innovators are a special subset of users who can discover cold items without the help of a recommender system. Therefore, cold items can be captured in the recommendation list via innovators, achieving a balance between serendipity and accuracy. To confirm the effectiveness of our algorithm, extensive experiments are conducted on the dataset provided by Alibaba Group in the Ali Mobile Recommendation Algorithm Competition, which is collected from a real e-commerce environment and covers massive user behavior log data.
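One way to make the innovator idea concrete, purely as an assumption-laden sketch, is to flag users whose histories contain an unusually high share of unpopular (cold) items; the paper instead grounds the definition in a user survey, and the thresholds below are illustrative.

```python
# Sketch of identifying "innovators" from an interaction log: users whose
# histories contain a high share of cold (unpopular) items. The popularity
# quantile and share cutoff are illustrative assumptions.
import pandas as pd

def find_innovators(log, cold_quantile=0.2, min_cold_share=0.5):
    """log: DataFrame with columns ['user', 'item']."""
    pop = log['item'].value_counts()
    cold_items = set(pop[pop <= pop.quantile(cold_quantile)].index)
    share = (log.assign(cold=log['item'].isin(cold_items))
                .groupby('user')['cold'].mean())
    return set(share[share >= min_cold_share].index)

log = pd.DataFrame({'user': [1, 1, 2, 2, 3, 3],
                    'item': ['a', 'b', 'a', 'a', 'c', 'd']})
print(find_innovators(log))   # users whose histories are rich in cold items
```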

13.
IEEE Trans Cybern ; 48(5): 1460-1473, 2018 May.
Article in English | MEDLINE | ID: mdl-28541232

ABSTRACT

Due to its ability to combine multiple base clusterings into a probably better and more robust clustering, the ensemble clustering technique has been attracting increasing attention in recent years. Despite this significant success, one limitation of most existing ensemble clustering methods is that they generally treat all base clusterings equally regardless of their reliability, which makes them vulnerable to low-quality base clusterings. Although some efforts have been made to (globally) evaluate and weight the base clusterings, these methods tend to view each base clustering as a whole and neglect the local diversity of clusters inside the same base clustering. It remains an open problem how to evaluate the reliability of clusters and exploit the local diversity in the ensemble to enhance the consensus performance, especially when there is no access to data features or specific assumptions about the data distribution. To address this, in this paper we propose a novel ensemble clustering approach based on ensemble-driven cluster uncertainty estimation and a local weighting strategy. In particular, the uncertainty of each cluster is estimated by considering the cluster labels in the entire ensemble via an entropic criterion. A novel ensemble-driven cluster validity measure is introduced, and a locally weighted co-association matrix is presented to serve as a summary of the ensemble of diverse clusters. With the local diversity in ensembles exploited, two novel consensus functions are further proposed. Extensive experiments on a variety of real-world datasets demonstrate the superiority of the proposed approach over the state-of-the-art.
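A simplified reading of the local weighting strategy: estimate each cluster's uncertainty by the entropy of how its members spread over the other base clusterings, turn that into a per-cluster weight, and accumulate a locally weighted co-association matrix. The exponential weighting and the theta parameter are assumptions following the abstract, not the paper's exact formulas.

```python
# Sketch of entropy-based cluster uncertainty and a locally weighted
# co-association matrix. Simplified reading of the strategy described above.
import numpy as np

def cluster_entropy(members, other_labels):
    # Entropy of how the cluster's members spread over another base clustering.
    _, counts = np.unique(other_labels[members], return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p + 1e-12))

def locally_weighted_coassociation(base_labels, theta=0.4):
    """base_labels: (M, n) array, one row of cluster labels per base clustering."""
    M, n = base_labels.shape
    W = np.zeros((n, n))
    for m in range(M):
        for c in np.unique(base_labels[m]):
            members = np.where(base_labels[m] == c)[0]
            h = sum(cluster_entropy(members, base_labels[o])
                    for o in range(M) if o != m)
            weight = np.exp(-h / (theta * M))      # low uncertainty -> high weight
            W[np.ix_(members, members)] += weight
    return W / M

labels = np.array([[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]])
print(locally_weighted_coassociation(labels))
```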

14.
IEEE Trans Pattern Anal Mach Intell ; 40(2): 392-408, 2018 02.
Article in English | MEDLINE | ID: mdl-28207383

ABSTRACT

The challenge of person re-identification (re-id) is to match individual images of the same person captured by different non-overlapping camera views against significant and unknown cross-view feature distortion. While a large number of distance metric/subspace learning models have been developed for re-id, the cross-view transformations they learn are view-generic and thus potentially less effective in quantifying the feature distortion inherent to each camera view. Learning view-specific feature transformations for re-id (i.e., view-specific re-id), an under-studied approach, becomes an alternative way to address this problem. In this work, we formulate a novel view-specific person re-identification framework from the feature augmentation point of view, called camera coRrelation Aware Feature augmenTation (CRAFT). Specifically, CRAFT performs cross-view adaptation by automatically measuring camera correlation from the cross-view visual data distribution and adaptively conducting feature augmentation to transform the original features into a new adaptive space. Through our augmentation framework, view-generic learning algorithms can be readily generalized to learn and optimize view-specific sub-models whilst simultaneously modelling view-generic discrimination information. Therefore, our framework not only inherits the strength of view-generic model learning but also provides an effective way to take view-specific characteristics into account. The CRAFT framework can be extended to jointly learn view-specific feature transformations for person re-id across a large network with more than two cameras, a largely under-investigated but realistic re-id setting. Additionally, we present a domain-generic deep person appearance representation which is designed to be view invariant in order to facilitate cross-view adaptation by CRAFT. We conducted extensive comparative experiments to validate the superiority and advantages of our proposed framework over state-of-the-art competitors on contemporary challenging person re-id datasets.
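For intuition only, the generic (non-correlation-aware) form of view-specific feature augmentation keeps a shared copy of each feature plus a slot reserved for its own camera view, so that view-generic learners can fit view-specific transformations on the augmented space; CRAFT's camera-correlation-aware weighting is not reproduced here.

```python
# Generic sketch of view-specific feature augmentation (not CRAFT's
# correlation-aware version): shared copy plus one slot per camera view.
import numpy as np

def augment(x, view, n_views):
    x = np.asarray(x, dtype=float)
    slots = np.zeros((n_views, x.size))
    slots[view] = x                               # view-specific copy
    return np.concatenate([x, slots.ravel()])     # [shared ; view-1 ; ... ; view-V]

f = augment(np.array([0.2, 0.7, 0.1]), view=1, n_views=2)   # length 3 + 2*3
```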

15.
IEEE Trans Syst Man Cybern B Cybern ; 37(4): 847-62, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17702284

ABSTRACT

This paper addresses the problem of automatically tuning multiple kernel parameters for the kernel-based linear discriminant analysis (LDA) method. The kernel approach has been proposed to solve face recognition problems under complex distributions by mapping the input space to a high-dimensional feature space. Recognition algorithms such as kernel principal component analysis, kernel Fisher discriminant, generalized discriminant analysis, and kernel direct LDA have been developed in the last five years. Experimental results show that the kernel-based approach is feasible and effective for tackling pose and illumination variations. One of the crucial factors in the kernel approach is the selection of kernel parameters, which strongly affects the generalization capability and stability of kernel-based learning methods. In view of this, we propose an eigenvalue-stability-bounded margin maximization (ESBMM) algorithm to automatically tune the multiple parameters of the Gaussian radial basis function kernel for the kernel subspace LDA (KSLDA) method, which builds on our previously developed subspace LDA method. The ESBMM algorithm improves the generalization capability of the kernel-based LDA method by maximizing the margin criterion while maintaining the eigenvalue stability of the kernel-based LDA method. An in-depth investigation of the generalization performance along the pose and illumination dimensions is performed using the YaleB and CMU PIE databases. The FERET database is also used for benchmark evaluation. Compared with existing PCA-based and LDA-based methods, our proposed KSLDA method, with the ESBMM kernel parameter estimation algorithm, gives superior performance.


Subjects
Artificial Intelligence , Biometry/methods , Face/anatomy & histology , Image Interpretation, Computer-Assisted/methods , Lighting , Pattern Recognition, Automated/methods , Posture , Algorithms , Computer Simulation , Discriminant Analysis , Humans
16.
IEEE Trans Image Process ; 26(6): 2588-2603, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28252397

ABSTRACT

Person re-identification (re-id) aims to match people across non-overlapping camera views. So far, RGB-based appearance has been widely used in most existing works. However, when people appear under extreme illumination or change clothes, RGB appearance-based re-id methods tend to fail. To overcome this problem, we propose to exploit depth information to provide more invariant body shape and skeleton information regardless of illumination and color change. More specifically, we exploit a depth voxel covariance descriptor and further propose a locally rotation-invariant depth shape descriptor called the Eigen-depth feature to describe pedestrian body shape. We prove that the distance between any two covariance matrices on the Riemannian manifold is equivalent to the Euclidean distance between the corresponding Eigen-depth features. Furthermore, we propose a kernelized implicit feature transfer scheme to estimate the Eigen-depth feature implicitly from an RGB image when depth information is not available. We find that combining the estimated depth features with RGB-based appearance features can help reduce the visual ambiguities of appearance features caused by illumination and similar clothes. The effectiveness of our models was validated on publicly available depth pedestrian datasets in comparison with related re-id methods.
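The stated equivalence between the Riemannian distance of covariance matrices and the Euclidean distance of the derived features matches the standard log-Euclidean vectorization, sketched below under that assumption; the exact Eigen-depth construction in the paper may differ in detail.

```python
# Sketch of the standard log-Euclidean vectorization of a covariance
# descriptor: mapping C -> vec(log(C)) makes the Euclidean distance between
# vectors equal the log-Euclidean Riemannian distance between matrices.
import numpy as np

def log_euclidean_vector(C, eps=1e-10):
    w, V = np.linalg.eigh(C)                    # C is symmetric positive definite
    L = V @ np.diag(np.log(np.maximum(w, eps))) @ V.T
    iu = np.triu_indices_from(L, k=1)
    # Off-diagonal entries scaled by sqrt(2) so the vector's Euclidean norm
    # equals the Frobenius norm of log(C).
    return np.concatenate([np.diag(L), np.sqrt(2.0) * L[iu]])

A = np.cov(np.random.rand(6, 100))
B = np.cov(np.random.rand(6, 100))
d_vec = np.linalg.norm(log_euclidean_vector(A) - log_euclidean_vector(B))
```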

17.
Gene ; 586(1): 148-57, 2016 Jul 15.
Article in English | MEDLINE | ID: mdl-27080954

ABSTRACT

Measuring the similarity between pairs of biological entities is important in molecular biology. The introduction of the Gene Ontology (GO) provides a promising approach to quantifying the semantic similarity between two genes or gene products. This kind of similarity measure is closely associated with the GO terms annotated to the biological entities under consideration and with the structure of the GO graph. However, previous works in this field mainly focused on the upper part of the graph and were seldom concerned with the lower part. In this study, we aim to explore information from the lower part of the GO graph for better semantic similarity. We propose a framework to quantify the similarity measure beneath a term pair, which takes into account both the information two ancestral terms share and the probability that they co-occur with their common descendants. The effectiveness of our approach was evaluated against seven typical measures on the public platform CESSM and on protein-protein interaction and gene expression datasets. Experimental results consistently show that the similarity derived from the lower part contributes to a better semantic similarity measure. The promising features of our approach are the following: (1) it provides a mirror model to characterize the information two ancestral terms share with respect to their common descendant; (2) it quantifies the probability that two terms co-occur with their common descendant in an efficient way; and (3) our framework can effectively capture the similarity measure beneath two terms, which can serve as an add-on to improve traditional semantic similarity measures between two GO terms. The algorithm was implemented in Matlab and is freely available from http://ejl.org.cn/bio/GOBeneath/.


Subjects
Gene Ontology , Semantics , Yeasts/genetics , Algorithms , Databases, Protein , Gene Expression Profiling , Humans , Proteins/metabolism , Terminology as Topic
18.
IEEE Trans Image Process ; 25(5): 2353-67, 2016 May.
Article in English | MEDLINE | ID: mdl-27019494

ABSTRACT

This paper proposes a novel approach to person re-identification, a fundamental task in distributed multi-camera surveillance systems. Although a variety of powerful algorithms have been presented in the past few years, most of them usually focus on designing hand-crafted features and learning metrics either individually or sequentially. Different from previous works, we formulate a unified deep ranking framework that jointly tackles both of these key components to maximize their strengths. We start from the principle that the correct match of the probe image should be positioned in the top rank within the whole gallery set. An effective learning-to-rank algorithm is proposed to minimize the cost corresponding to the ranking disorders of the gallery. The ranking model is solved with a deep convolutional neural network (CNN) that builds the relation between input image pairs and their similarity scores through joint representation learning directly from raw image pixels. The proposed framework allows us to get rid of feature engineering and does not rely on any assumption. An extensive comparative evaluation is given, demonstrating that our approach significantly outperforms all the state-of-the-art approaches, including both traditional and CNN-based methods on the challenging VIPeR, CUHK-01, and CAVIAR4REID datasets. In addition, our approach has better ability to generalize across datasets without fine-tuning.
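A minimal sketch of the ranking principle described above: the correct gallery match of a probe should score higher than impostors by a margin. The hinge form, the margin value, and the score shapes are assumptions; the paper's ranking cost and CNN are not reproduced.

```python
# Sketch of a pairwise ranking objective of the kind such frameworks optimize:
# penalize whenever an impostor's similarity comes within a margin of the
# true match's similarity.
import torch
import torch.nn.functional as F

def ranking_loss(score_pos, score_neg, margin=1.0):
    # score_pos: (B,) similarity of each probe to its true match.
    # score_neg: (B, K) similarities of each probe to K impostors.
    return F.relu(margin - score_pos.unsqueeze(1) + score_neg).mean()

loss = ranking_loss(torch.randn(16), torch.randn(16, 10))
```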


Subjects
Algorithms , Biometric Identification/methods , Databases, Factual , Humans , Machine Learning , Neural Networks, Computer
19.
PLoS One ; 11(2): e0147944, 2016.
Article in English | MEDLINE | ID: mdl-26828803

ABSTRACT

The user-based collaborative filtering (CF) algorithm is one of the most popular approaches for making recommendations. Despite its success, the traditional user-based CF algorithm suffers from a serious problem: it only measures the influence between two users based on their symmetric similarities, calculated from their consumption histories. This means that, for a pair of users, their influences on each other are the same, which may not be true. Intuitively, an expert may have an impact on a novice user, but a novice user may not affect an expert at all. Besides, each user may possess a global importance factor that affects his/her influence on the remaining users. To this end, in this paper, we propose an asymmetric user influence model to measure the directed influence between two users and adopt the PageRank algorithm to calculate the global importance value of each user. The directed influence values and the global importance values are then integrated to deduce the final influence values between two users. Finally, we use the final influence values to improve the performance of the traditional user-based CF algorithm. Extensive experiments have been conducted, and the results confirm that both the asymmetric user influence model and the global importance value play key roles in improving recommendation accuracy; hence the proposed method significantly outperforms existing recommendation algorithms, in particular the user-based CF algorithm, on datasets with high rating density.
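A rough sketch of the two ingredients, under assumptions: a directed influence matrix derived from overlapping consumption histories and a PageRank-style global importance score; how they are combined into the final influence value is assumed here and may differ from the paper.

```python
# Sketch of asymmetric influence plus PageRank-style global importance.
# The combination rule in the last line is an assumption for illustration.
import numpy as np

def pagerank(adj, damping=0.85, iters=100):
    n = adj.shape[0]
    col_sum = adj.sum(axis=0, keepdims=True)
    P = np.divide(adj, col_sum, out=np.full_like(adj, 1.0 / n), where=col_sum > 0)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * P @ r
    return r

def asymmetric_influence(R):
    """R: binary user-item matrix. Influence of u on v = |I_u ∩ I_v| / |I_v|."""
    overlap = R @ R.T
    consumed = R.sum(axis=1)
    return overlap / np.maximum(consumed[None, :], 1)

R = (np.random.rand(5, 20) > 0.7).astype(float)
influence = asymmetric_influence(R) * pagerank(asymmetric_influence(R))[:, None]
```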


Subjects
Algorithms , Databases as Topic , Humans , Principal Component Analysis
20.
IEEE Trans Syst Man Cybern B Cybern ; 35(5): 1065-78, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16240780

ABSTRACT

This paper addresses the dimension reduction problem in Fisherface for face recognition. When the number of training samples is less than the image dimension (total number of pixels), the within-class scatter matrix (Sw) in Linear Discriminant Analysis (LDA) is singular, and Principal Component Analysis (PCA) is suggested in Fisherface for dimension reduction so that Sw becomes nonsingular. A popular method is to select the largest nonzero eigenvalues and the corresponding eigenvectors for LDA. To attenuate the illumination effect, some researchers suggested removing the three eigenvectors with the largest eigenvalues, which improves performance. However, as far as we know, there is no systematic way to determine which eigenvalues should be used. Along this line, this paper proposes a theorem to interpret why PCA can be used in LDA and an automatic, systematic method to select the eigenvectors to be used in LDA using a Genetic Algorithm (GA). A GA-PCA is then developed. It is found that some small eigenvectors should also be used as part of the basis for dimension reduction. Using GA-PCA to reduce the dimension, a GA-Fisher method is designed and developed. Compared with the traditional Fisherface method, the proposed GA-Fisher offers two additional advantages. First, optimal bases for dimensionality reduction are derived from GA-PCA. Second, the computational efficiency of LDA is improved by adding a whitening procedure after dimension reduction. The Face Recognition Technology (FERET) and Carnegie Mellon University Pose, Illumination, and Expression (CMU PIE) databases are used for evaluation. Experimental results show an improvement of almost 5% over Fisherface, which is encouraging.
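For context, the Fisherface-style pipeline the paper builds on (PCA for dimension reduction, whitening, then LDA) can be sketched as below; the genetic-algorithm eigenvector selection itself is not shown, and plain leading-component PCA stands in for GA-PCA.

```python
# Sketch of the Fisherface-style pipeline: PCA to make the within-class
# scatter nonsingular, a whitening step, then LDA. GA-PCA would replace the
# simple "keep the leading components" choice made by PCA here.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 1024))            # 60 face images, 32x32 pixels flattened
y = np.repeat(np.arange(6), 10)            # 6 identities

model = make_pipeline(PCA(n_components=40, whiten=True),   # PCA + whitening
                      LinearDiscriminantAnalysis())
model.fit(X, y)
print(model.score(X, y))
```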


Subjects
Algorithms , Artificial Intelligence , Face/anatomy & histology , Image Interpretation, Computer-Assisted/methods , Linear Models , Models, Biological , Pattern Recognition, Automated/methods , Computer Simulation , Discriminant Analysis , Humans , Image Enhancement/methods , Information Storage and Retrieval/methods , Principal Component Analysis , Subtraction Technique