Results 1 - 8 of 8
1.
IEEE Trans Cybern ; 53(4): 2494-2505, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34793316

ABSTRACT

In recent years, ensemble methods have shown strong performance and gained popularity in visual tasks. However, the performance of an ensemble is limited by the paucity of diversity among its models. Thus, to enrich the diversity of the ensemble, we present a distillation approach: learning from experts (LFEs). This approach builds on a novel knowledge distillation (KD) method, specific expert learning (SEL), which reduces class selectivity and improves both the performance on specific weaker classes and the overall accuracy. Through SEL, models can acquire different knowledge from distinct networks with various areas of expertise, and a highly diverse ensemble can then be obtained. Our experimental results demonstrate that, on CIFAR-10, SEL increases the accuracy of ResNet-32 by 0.91%, and the ensemble trained by SEL increases accuracy by 1.13%. By comparison, state-of-the-art approaches such as DML improve accuracy by only 0.3% and 1.02% on a single ResNet-32 and the ensemble, respectively. Furthermore, our proposed architecture can also be applied to ensemble distillation (ED), which applies KD to the ensemble model. In conclusion, our experimental results show that the proposed SEL not only improves the accuracy of a single classifier but also boosts the diversity of the ensemble model.
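The abstract does not spell out the SEL objective, so the sketch below only illustrates the general mechanism: a standard temperature-softened knowledge distillation loss, plus a hypothetical multi-expert variant in which each expert's distillation term is weighted by how much of that expert's specialty the student should absorb. The weighting scheme is an assumption for illustration, not the paper's formulation.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()

def multi_expert_kd_loss(student_logits, expert_logits_list, expert_weights):
    """Hypothetical SEL-style objective: the student distills from several
    expert networks at once, with each expert's term weighted by how much
    of its specialty the student should acquire."""
    return sum(w * kd_loss(student_logits, logits)
               for logits, w in zip(expert_logits_list, expert_weights))
```

The loss is zero when student and teacher agree and grows as their softened predictions diverge, so minimizing it pulls the student toward each expert's behavior on that expert's strong classes.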

2.
IEEE Trans Neural Netw Learn Syst ; 33(9): 4584-4597, 2022 Sep.
Article in English | MEDLINE | ID: mdl-33635797

ABSTRACT

The virtual try-on task has drawn considerable attention in the field of computer vision. However, presenting 3-D physical characteristics (e.g., pleats and shadows) based on a 2-D image is very challenging. Although there have been several previous studies on 2-D-based virtual try-on, most: 1) required user-specified target poses, which are not user-friendly and may not be the best fit for the target clothing; and 2) failed to address some problematic cases, including facial details, clothing wrinkles, and body occlusions. To address these two challenges, in this article, we propose an innovative template-free try-on image synthesis (TF-TIS) network. TF-TIS first synthesizes the target pose according to the user-specified in-shop clothing. Afterward, given an in-shop clothing image, a user image, and a synthesized pose, we propose a novel model for synthesizing a human try-on image with the target clothing in the best-fitting pose. Both qualitative and quantitative experiments indicate that the proposed TF-TIS outperforms state-of-the-art methods, especially in difficult cases.

3.
Article in English | MEDLINE | ID: mdl-32286975

ABSTRACT

Noise causes unpleasant visual effects in low-light image/video enhancement. In this paper, we aim to make the enhancement model and method noise-aware throughout the whole process. To deal with the heavy noise that previous methods do not handle, we introduce a robust low-light enhancement approach that jointly enhances low-light images/videos and suppresses intensive noise. Our method is based on the proposed Low-Rank Regularized Retinex Model (LR3M), which is the first to inject a low-rank prior into a Retinex decomposition process to suppress noise in the reflectance map. Our method estimates a piece-wise smoothed illumination and a noise-suppressed reflectance sequentially, avoiding the residual noise in the illumination and reflectance maps that alternative decomposition methods usually leave behind. After obtaining the estimated illumination and reflectance, we adjust the illumination layer and generate our enhancement result. Furthermore, we apply LR3M to low-light video enhancement: we enforce inter-frame coherence of the illumination maps and find similar patches across the reflectance maps of successive frames to form the low-rank prior, thereby exploiting temporal correspondence. Our method performs well for a wide variety of images and videos, and achieves better quality in both enhancement and denoising compared with state-of-the-art methods.
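The sequential illumination-then-reflectance estimation can be illustrated with a bare-bones Retinex decomposition. This is only a sketch: the low-rank reflectance denoising that is the core of LR3M is omitted, and `box_blur` is a crude stand-in for the paper's piece-wise illumination smoothing.

```python
import numpy as np

def box_blur(x, radius):
    """Simple box blur as a stand-in for an edge-aware, piece-wise smoother."""
    pad = np.pad(x, radius, mode='edge')
    out = np.zeros_like(x)
    k = 2 * radius + 1
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)

def retinex_enhance(img, radius=7, gamma=0.6, eps=1e-3):
    """Minimal Retinex-style enhancement sketch (not the full LR3M model):
    1) estimate a smoothed illumination L from the max color channel,
    2) recover reflectance R = S / L,
    3) gamma-adjust L and recompose. Input img is float RGB in [0, 1]."""
    s = img.astype(np.float64)
    illum = s.max(axis=2)                      # initial illumination estimate
    illum = box_blur(illum, radius)            # piece-wise smoothing stand-in
    illum = np.clip(illum, eps, 1.0)
    refl = s / illum[..., None]                # reflectance map
    enhanced = refl * (illum ** gamma)[..., None]
    return np.clip(enhanced, 0.0, 1.0)
```

Because the recomposed result is `S * L**(gamma - 1)` with `L <= 1` and `gamma < 1`, dark regions are brightened while already-bright regions change little.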

4.
IEEE Trans Cybern ; 48(1): 423-435, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28026799

ABSTRACT

Extracting a clean background is important for computer vision and augmented reality. Background extraction generally assumes that a clean background shot exists somewhere in the input sequence, but realistic situations, such as highway traffic videos, may violate this assumption. Our probabilistic model-based method therefore formulates the fusion of candidate background patches from the input sequence as a random walk problem and seeks a globally optimal solution based on their temporal and spatial relationships. Furthermore, we design two quality measures that consider spatial and temporal coherence and contrast distinctness among pixels as the basis for background selection. A static background should have high temporal coherence among frames, and thus we improve our fusion precision with a temporal contrast filter and an optical-flow-based motionless patch extractor. Experiments demonstrate that our algorithm can extract artifact-free background images at low computational cost compared with state-of-the-art algorithms.
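As a loose illustration of casting background fusion as a random walk, the sketch below builds a row-stochastic transition matrix from pairwise patch affinities and uses the stationary distribution to score how well each candidate patch agrees with the others. The affinity function here is an assumption for illustration; the paper's quality measures (temporal/spatial coherence, contrast distinctness) are not reproduced.

```python
import numpy as np

def stationary_scores(affinity, n_iter=100):
    """Power-iterate a row-stochastic transition matrix built from pairwise
    patch affinities; the stationary distribution scores how strongly each
    candidate is supported by the ensemble of candidates."""
    P = affinity / affinity.sum(axis=1, keepdims=True)
    pi = np.full(len(P), 1.0 / len(P))
    for _ in range(n_iter):
        pi = pi @ P
    return pi

def fuse_background(candidates):
    """candidates: (n, h, w) grayscale patches proposed for one location.
    Returns the candidate most consistent with the others, so a transient
    foreground object (an outlier patch) is voted out."""
    n = len(candidates)
    flat = candidates.reshape(n, -1)
    d = np.linalg.norm(flat[:, None] - flat[None, :], axis=2)  # pairwise distances
    aff = np.exp(-d / (d.mean() + 1e-9))                        # distance -> affinity
    best = int(np.argmax(stationary_scores(aff)))
    return candidates[best]
```

A patch that agrees with most frames accumulates stationary probability mass, while a patch contaminated by a passing vehicle receives little support.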

5.
IEEE Trans Cybern ; 48(5): 1647-1659, 2018 May.
Article in English | MEDLINE | ID: mdl-28641273

ABSTRACT

According to the theory of clothing design, the genres of clothes can be recognized based on a set of visually differentiable style elements, which exhibit salient features of visual appearance and reflect high-level fashion styles, better describing clothing genres. Instead of using less-discriminative low-level features or ambiguous keywords to identify clothing genres, we proposed a novel approach for automatically classifying clothing genres based on the visually differentiable style elements. A set of style elements that are crucial for recognizing specific visual styles of clothing genres was identified based on clothing design theory. In addition, the corresponding salient visual features of each style element were identified and formulated with variables that can be computationally derived with various computer vision algorithms. To evaluate the performance of our algorithm, a dataset containing 3250 full-body shots crawled from popular online stores was built. Recognition results show that our proposed algorithms achieved promising overall precision, recall, and F-score of 88.76%, 88.53%, and 88.64% for recognizing upperwear genres, and 88.21%, 88.17%, and 88.19% for recognizing lowerwear genres, respectively. The effectiveness of each style element and its visual features for recognizing clothing genres was demonstrated through a set of experiments involving different sets of style elements or features. In summary, our experimental results demonstrate the effectiveness of the proposed method in clothing genre recognition.
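The reported F-scores are consistent with the harmonic mean of the corresponding precision and recall, which can be checked directly:

```python
def f_score(precision, recall, beta=1.0):
    """F-measure of precision and recall; beta=1 gives the balanced F1
    (harmonic mean). Inputs may be fractions or percentages, as long as
    both use the same scale."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

upper = f_score(88.76, 88.53)  # upperwear: ~88.64, matching the abstract
lower = f_score(88.21, 88.17)  # lowerwear: ~88.19, matching the abstract
```

This confirms the garbled "-score" in the source record was an F-score (F1) whose leading symbol was lost in extraction.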

6.
Onco Targets Ther ; 8: 2015-22, 2015.
Article in English | MEDLINE | ID: mdl-26346558

ABSTRACT

Lung cancer has a poor prognosis when it is not diagnosed early and unresectable lesions are present. The management of small lung nodules noted on computed tomography scans is controversial due to uncertain tumor characteristics. A conventional computer-aided diagnosis (CAD) scheme requires several image processing and pattern recognition steps to produce a quantitative tumor differentiation result. In such an ad hoc image analysis pipeline, every step depends heavily on the performance of the previous step. Accordingly, tuning the classification performance of a conventional CAD scheme is complicated and arduous. Deep learning techniques, on the other hand, have the intrinsic advantage of automatic feature exploitation and seamless performance tuning. In this study, we attempted to simplify the image analysis pipeline of conventional CAD with deep learning techniques. Specifically, we introduced a deep belief network and a convolutional neural network in the context of nodule classification in computed tomography images. Two baseline methods with feature computing steps were implemented for comparison. The experimental results suggest that deep learning methods can achieve better discriminative results and hold promise in the CAD application domain.

7.
IEEE Trans Cybern ; 45(4): 742-53, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25069133

ABSTRACT

The difficulty of vision-based posture estimation has been greatly reduced with the aid of commercial depth cameras such as the Microsoft Kinect. However, much remains to be done to bridge the results of human posture estimation and the understanding of human movements. Human movement assessment is an important technique for exercise learning in the field of healthcare. In this paper, we propose an action tutor system that enables the user to interactively retrieve a learning exemplar of the target action movement and to immediately receive motion instructions while practicing it in front of the Kinect. The proposed system is composed of two stages. In the retrieval stage, nonlinear time warping algorithms are designed to retrieve video segments similar to the query movement roughly performed by the user. In the learning stage, the user learns from the selected video exemplar, and a motion assessment covering both static and dynamic differences is presented in an effective and organized way, helping the user perform the action movement correctly. The experiments are conducted on videos of ten action types, and the results show that the proposed human action descriptor is representative for action video retrieval and that the tutor system effectively helps the user learn action movements.
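The nonlinear time warping in the retrieval stage can be illustrated with a classic dynamic-time-warping (DTW) distance over per-frame pose features; the paper's actual descriptor and warping algorithm may differ, so treat this as a generic sketch.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping between two pose-feature sequences
    (rows = frames). Allows a roughly performed query to match an
    exemplar performed at a different speed."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])  # frame-to-frame cost
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def retrieve(query, exemplars):
    """Return the index of the exemplar sequence closest to the query."""
    return min(range(len(exemplars)),
               key=lambda k: dtw_distance(query, exemplars[k]))
```

Because the alignment path can stretch or compress time, the same motion performed slowly still matches its exemplar far better than a different motion of equal length.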


Subjects
Actigraphy/instrumentation ; Actigraphy/methods ; Motor Activity/physiology ; Movement/physiology ; Pattern Recognition, Automated/methods ; Video Games ; Algorithms ; Computer Systems ; Humans ; Reproducibility of Results ; Sensitivity and Specificity ; Transducers
8.
IEEE Trans Image Process ; 23(3): 1047-59, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24474374

ABSTRACT

Camera-enabled mobile devices are commonly used as interaction platforms for linking the user's virtual and physical worlds in numerous research and commercial applications, such as serving as an augmented reality interface for mobile information retrieval. These application scenarios give rise to a key technique: visual object recognition in daily life. On-premise signs (OPSs), a popular form of commercial advertising, are widely used in daily life. OPSs often exhibit great visual diversity (e.g., appearing in arbitrary sizes), accompanied by complex environmental conditions (e.g., foreground and background clutter). Observing that such real-world characteristics are lacking in most existing image data sets, in this paper we first propose an OPS data set, namely OPS-62, in which a total of 4649 OPS images of 62 different businesses were collected from Google Street View. Further, to address the problem of real-world OPS learning and recognition, we develop a probabilistic framework based on distributional clustering, in which we exploit the distributional information of each visual feature (the distribution of its associated OPS labels) as a reliable selection criterion for building discriminative OPS models. Experiments on the OPS-62 data set demonstrate that our approach outperforms state-of-the-art probabilistic latent semantic analysis models, yielding more accurate recognition and fewer false alarms, with a significant 151.28% relative improvement in the average recognition rate. Meanwhile, our approach is simple, linear, and can be executed in parallel, making it practical and scalable for large-scale multimedia applications.
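One way to read the selection criterion: for each visual feature, estimate the distribution of OPS labels it co-occurs with, and prefer features whose distribution concentrates on few labels. The entropy-based sketch below is a minimal illustration of that idea; the paper's distributional-clustering formulation is more involved.

```python
import numpy as np

def label_distributions(feature_label_counts):
    """feature_label_counts: (n_features, n_labels) co-occurrence counts of
    each visual feature with each OPS label. Returns per-feature label
    distributions (rows sum to 1)."""
    counts = feature_label_counts.astype(np.float64)
    return counts / counts.sum(axis=1, keepdims=True)

def select_discriminative(feature_label_counts, k):
    """Rank features by the entropy of their label distribution and keep
    the k lowest-entropy (most label-specific) ones - a sketch of using
    distributional information as a feature selection criterion."""
    p = label_distributions(feature_label_counts)
    entropy = -np.sum(np.where(p > 0, p * np.log(p), 0.0), axis=1)
    return np.argsort(entropy)[:k]
```

A feature that fires almost exclusively on one business's signs has near-zero entropy and is kept; a feature spread evenly across many labels carries little discriminative information and is dropped.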


Subjects
Algorithms ; Artificial Intelligence ; Image Interpretation, Computer-Assisted/methods ; Location Directories and Signs ; Natural Language Processing ; Pattern Recognition, Automated/methods ; Image Enhancement/methods ; Reproducibility of Results ; Sensitivity and Specificity