Results 1 - 6 of 6
1.
IEEE Trans Neural Netw Learn Syst ; 32(11): 5241-5246, 2021 Nov.
Article in English | MEDLINE | ID: mdl-33021944

ABSTRACT

Machine learning (ML) methods are popular in several application areas of multimedia signal processing. However, most existing solutions in this area, including the popular least squares, rely on penalizing predictions that deviate from the target ground-truth values. In other words, uncertainty in the ground-truth data is simply ignored. As a result, optimization and validation overemphasize a single target value when, in fact, human subjects themselves did not unanimously agree on it. This leads to an unreasonable scenario where the trained model is not allowed the benefit of the doubt in terms of prediction accuracy. The problem becomes even more significant in the context of more recent human-centric and immersive multimedia systems where user feedback and interaction are influenced by higher degrees of freedom (leading to higher levels of uncertainty in the ground truth). To ameliorate this drawback, we propose an uncertainty-aware loss function (referred to as [Formula: see text]) that explicitly accounts for data uncertainty and is useful for both optimization (training) and validation. As examples, we demonstrate the utility of the proposed method for blind estimation of perceptual quality of audiovisual signals, panoramic images, and images affected by camera-induced distortions. The experimental results support the theoretical ideas in terms of reducing prediction errors. The proposed method is also relevant in the context of more recent paradigms, such as crowdsourcing, where larger uncertainty in ground truth is expected.
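
The abstract does not reproduce the loss itself (it appears only as "[Formula: see text]"), so the following is not the authors' formulation. It is a minimal sketch of one way to give a model "the benefit of the doubt": predictions are penalized only when they fall outside a tolerance band around the mean opinion score. The function name, the band width `k * sigma`, and the use of the per-item rating standard deviation as the uncertainty measure are all assumptions made for illustration.

```python
import numpy as np

def uncertainty_aware_loss(pred, mos, sigma, k=1.0):
    """Hypothetical loss: squared error measured from the edge of the band
    mos +/- k*sigma; predictions inside the band incur no penalty."""
    pred = np.asarray(pred, dtype=float)
    mos = np.asarray(mos, dtype=float)      # mean opinion scores (ground truth)
    sigma = np.asarray(sigma, dtype=float)  # per-item rating spread (assumed uncertainty)
    excess = np.maximum(np.abs(pred - mos) - k * sigma, 0.0)
    return float(np.mean(excess ** 2))

def plain_mse(pred, mos):
    """Ordinary least-squares loss, for comparison."""
    pred = np.asarray(pred, dtype=float)
    mos = np.asarray(mos, dtype=float)
    return float(np.mean((pred - mos) ** 2))
```

With `pred=[3.0, 4.0]`, `mos=[3.2, 3.0]` and `sigma=[0.5, 0.5]`, the first prediction lies inside its band and contributes nothing, whereas plain least squares penalizes both predictions.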


Subjects
Machine Learning/trends , Multimedia/trends , Neural Networks, Computer , Uncertainty , Humans , Image Processing, Computer-Assisted/methods , Image Processing, Computer-Assisted/trends , Least-Squares Analysis
2.
IEEE Trans Image Process ; 23(6): 2625-36, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24832595

ABSTRACT

Many saliency detection models for 2D images have been proposed for various multimedia processing applications during the past decades. Currently, the emerging applications of stereoscopic display require new saliency detection models for salient region extraction. Different from saliency detection for 2D images, the depth feature has to be taken into account in saliency detection for stereoscopic images. In this paper, we propose a novel stereoscopic saliency detection framework based on the feature contrast of color, luminance, texture, and depth. Four types of features, namely color, luminance, texture, and depth, are extracted from discrete cosine transform coefficients for feature contrast calculation. A Gaussian model of the spatial distance between image patches is adopted for consideration of local and global contrast calculation. Then, a new fusion method is designed to combine the feature maps to obtain the final saliency map for stereoscopic images. In addition, we adopt the center bias factor and human visual acuity, the important characteristics of the human visual system, to enhance the final saliency map for stereoscopic images. Experimental results on eye tracking databases show the superior performance of the proposed model over other existing methods.
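
As a rough illustration of the feature-contrast step described above, the sketch below scores each patch by how much its feature vector differs from those of other patches, weighting each difference by a Gaussian of the spatial distance between the patches. It is a simplified stand-in, not the published model: DCT feature extraction, the fusion of the four feature maps, center bias, and visual acuity are all omitted, and the function name, the `sigma` parameter, and the max-normalization are assumptions.

```python
import numpy as np

def patch_contrast_saliency(features, positions, sigma=1.0):
    """Hypothetical contrast map: Gaussian-weighted sum of feature differences
    between each patch and every other patch."""
    features = np.asarray(features, dtype=float)    # shape (N, d), one row per patch
    positions = np.asarray(positions, dtype=float)  # shape (N, 2), patch coordinates
    # Squared spatial distance between every pair of patches.
    dist2 = np.sum((positions[:, None, :] - positions[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-dist2 / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)  # a patch does not contrast with itself
    # Feature difference (Euclidean norm) between every pair of patches.
    contrast = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    saliency = np.sum(weights * contrast, axis=1)
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency
```

A small `sigma` emphasizes local contrast, since distant patches receive negligible weight, while a large `sigma` approaches a global contrast measure.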


Subjects
Algorithms , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Photogrammetry/methods , Subtraction Technique , Artificial Intelligence , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
3.
IEEE Trans Image Process ; 21(8): 3364-77, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22562758

ABSTRACT

We present a new image quality assessment (IQA) algorithm based on the phase and magnitude of the 2D (two-dimensional) Discrete Fourier Transform (DFT). The basic idea is to compare the phase and magnitude of the reference and distorted images to compute the quality score. However, it is well known that the human visual system's (HVS's) sensitivity to different frequency components is not the same. We accommodate this fact via a simple yet effective strategy of nonuniform binning of the frequency components. This process also leads to a reduced-space representation of the image, thereby enabling the reduced-reference (RR) prospects of the proposed scheme. We employ linear regression to integrate the effects of the changes in phase and magnitude. In this way, the required weights are determined via proper training and are hence more convincing and effective. Lastly, using the fact that phase usually conveys more information than magnitude, we use only the phase for RR quality assessment. This provides the crucial advantage of further reduction in the required amount of reference image information. The proposed method is therefore further scalable for RR scenarios. We report extensive experimental results using a total of 9 publicly available databases: 7 image databases (with a total of 3832 distorted images with diverse distortions) and 2 video databases (228 distorted videos in total). These show that the proposed method is overall better than several of the existing full-reference (FR) algorithms and two RR algorithms. Additionally, there is a graceful degradation in prediction performance as the amount of reference image information is reduced, thereby confirming its scalability prospects. To enable comparisons and future study, a Matlab implementation of the proposed algorithm is available at http://www.ntu.edu.sg/home/wslin/reduced_phase.rar.
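
The comparison of DFT phase and magnitude over nonuniform frequency bins can be sketched as follows. This is an illustration under stated assumptions rather than a port of the authors' Matlab code: the log-spaced radial bin edges, the number of bins, and the function name are hypothetical, and the trained linear-regression weights that combine the features into a score are not included.

```python
import numpy as np

def dft_bin_features(ref, dist, n_bins=4):
    """Hypothetical features: mean magnitude and phase differences between two
    images, aggregated over log-spaced radial frequency bins."""
    ref = np.asarray(ref, dtype=float)
    dist = np.asarray(dist, dtype=float)
    f_ref, f_dist = np.fft.fft2(ref), np.fft.fft2(dist)
    # Radial frequency (cycles per sample) of every DFT coefficient.
    fy = np.fft.fftfreq(ref.shape[0])[:, None]
    fx = np.fft.fftfreq(ref.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    # Assumed nonuniform binning: narrow bins at low frequencies, wide at high.
    edges = np.geomspace(1e-3, radius.max() + 1e-9, n_bins + 1)
    edges[0] = 0.0  # make the first bin include the DC term
    bin_idx = np.clip(np.digitize(radius, edges) - 1, 0, n_bins - 1)
    mag_diff = np.abs(np.abs(f_ref) - np.abs(f_dist))
    phase_diff = np.abs(np.angle(f_ref * np.conj(f_dist)))  # wrapped to [0, pi]
    mag_feat = np.zeros(n_bins)
    phase_feat = np.zeros(n_bins)
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():  # small images can leave some bins empty
            mag_feat[b] = mag_diff[mask].mean()
            phase_feat[b] = phase_diff[mask].mean()
    return mag_feat, phase_feat
```

Keeping only `phase_feat` from the reference side mirrors the reduced-reference idea in the abstract: a few numbers per bin are transmitted instead of the full image.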


Subjects
Algorithms , Artifacts , Fourier Analysis , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted
4.
IEEE Trans Image Process ; 21(4): 1500-12, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22106145

ABSTRACT

In this paper, we propose a new image quality assessment (IQA) scheme, with emphasis on gradient similarity. Gradients convey important visual information and are crucial to scene understanding. Using such information, structural and contrast changes can be effectively captured. Therefore, we use the gradient similarity to measure the change in contrast and structure in images. Apart from the structural/contrast changes, image quality is also affected by luminance changes, which must also be accounted for to achieve complete and more robust IQA. Hence, the proposed scheme considers both luminance and contrast-structural changes to effectively assess image quality. Furthermore, the proposed scheme is designed to follow the masking effect and visibility threshold more closely, i.e., the case when both masked and masking signals are small is more effectively tackled by the proposed scheme. Finally, the effects of the changes in luminance and contrast-structure are integrated via an adaptive method to obtain the overall image quality score. Extensive experiments conducted with six publicly available subject-rated databases (comprising diverse images and distortion types) have confirmed the effectiveness, robustness, and efficiency of the proposed scheme in comparison with the relevant state-of-the-art schemes.
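
A gradient similarity term of the general kind described above can be sketched in a few lines. The abstract does not give the gradient operator, the stabilizing constant, or the adaptive combination with luminance, so the central-difference gradients from `np.gradient`, the constant `c`, and the simple mean pooling below are assumptions chosen for brevity.

```python
import numpy as np

def gradient_magnitude(img):
    """Gradient magnitude from central differences (an assumed operator)."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.hypot(gx, gy)

def gradient_similarity(ref, dist, c=1e-3):
    """Hypothetical score in (0, 1]: 1 when the gradient magnitudes match
    everywhere, lower as contrast or structure changes."""
    g_ref = gradient_magnitude(ref)
    g_dist = gradient_magnitude(dist)
    # The constant c keeps the ratio stable where both gradients are near zero.
    sim_map = (2.0 * g_ref * g_dist + c) / (g_ref ** 2 + g_dist ** 2 + c)
    return float(sim_map.mean())
```

The constant `c` also governs how the measure behaves when both gradients are small, which is the regime the abstract singles out in its discussion of masking and visibility thresholds.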


Subjects
Algorithms , Artifacts , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Subtraction Technique , Quality Control , Reproducibility of Results , Sensitivity and Specificity
5.
IEEE Trans Syst Man Cybern B Cybern ; 42(2): 347-64, 2012 Apr.
Article in English | MEDLINE | ID: mdl-21965214

ABSTRACT

We study the use of machine learning for visual quality evaluation with comprehensive singular value decomposition (SVD)-based visual features. In this paper, the two-stage process and the relevant work in the existing visual quality metrics are first introduced followed by an in-depth analysis of SVD for visual quality assessment. Singular values and vectors form the selected features for visual quality assessment. Machine learning is then used for the feature pooling process and demonstrated to be effective. This is to address the limitations of the existing pooling techniques, like simple summation, averaging, Minkowski summation, etc., which tend to be ad hoc. We advocate machine learning for feature pooling because it is more systematic and data driven. The experiments show that the proposed method outperforms the eight existing relevant schemes. Extensive analysis and cross validation are performed with ten publicly available databases (eight for images with a total of 4042 test images and two for video with a total of 228 videos). We use all publicly accessible software and databases in this study, as well as making our own software public, to facilitate comparison in future research.
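
To make the contrast between ad hoc pooling and learned pooling concrete, the sketch below extracts a few singular-value differences as features and fits pooling weights from training scores. It is a simplified stand-in for the approach in the abstract: ordinary least squares replaces whatever learner the authors used, only singular values (not vectors) are used, and all function names and the choice of `k` are hypothetical.

```python
import numpy as np

def singular_value_features(ref, dist, k=4):
    """Absolute differences between the k largest singular values of the
    reference and distorted images (an assumed feature definition)."""
    s_ref = np.linalg.svd(np.asarray(ref, dtype=float), compute_uv=False)
    s_dist = np.linalg.svd(np.asarray(dist, dtype=float), compute_uv=False)
    return np.abs(s_ref[:k] - s_dist[:k])

def fit_linear_pooling(feature_rows, scores):
    """Learn pooling weights plus a bias from (features, subjective score) pairs."""
    feature_rows = np.asarray(feature_rows, dtype=float)
    design = np.column_stack([feature_rows, np.ones(len(feature_rows))])
    weights, *_ = np.linalg.lstsq(design, np.asarray(scores, dtype=float), rcond=None)
    return weights

def pool(features, weights):
    """Data-driven pooling: weighted sum of the features plus the learned bias."""
    return float(np.dot(features, weights[:-1]) + weights[-1])

def average_pool(features):
    """Ad hoc baseline: every feature counts equally."""
    return float(np.mean(features))
```

The learned weights let informative features dominate the final score, whereas `average_pool` fixes every weight in advance regardless of the data.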

6.
IEEE Trans Neural Netw ; 21(3): 515-9, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20100674

ABSTRACT

Objective image quality estimation is useful in many visual processing systems, and is difficult to perform in line with the human perception. The challenge lies in formulating effective features and fusing them into a single number to predict the quality score. In this brief, we propose a new approach to address the problem, with the use of singular vectors out of singular value decomposition (SVD) as features for quantifying major structural information in images and then support vector regression (SVR) for automatic prediction of image quality. The feature selection with singular vectors is novel and general for gauging structural changes in images as a good representative of visual quality variations. The use of SVR exploits the advantages of machine learning with the ability to learn complex data patterns for an effective and generalized mapping of features into a desired score, in contrast with the oft-utilized feature pooling process in the existing image quality estimators; this is to overcome the difficulty of model parameter determination for such a system to emulate the related, complex human visual system (HVS) characteristics. Experiments conducted with three independent databases confirm the effectiveness of the proposed system in predicting image quality with better alignment with the HVS's perception than the relevant existing work. The tests with untrained distortions and databases further demonstrate the robustness of the system and the importance of the feature selection.
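
One way to turn singular vectors into features for a regressor is to measure how well the leading singular vectors of the distorted image align with those of the reference. The abstract does not specify the feature definition, so the absolute dot products below, the choice of `k`, and the function name are assumptions. The SVR stage is omitted; in practice these features would be passed to a support vector regressor (for example, scikit-learn's `SVR`) trained on subjective scores.

```python
import numpy as np

def singular_vector_features(ref, dist, k=4):
    """Hypothetical features: alignment between the k leading left and right
    singular vectors of the reference and distorted images (1 = unchanged)."""
    ref = np.asarray(ref, dtype=float)
    dist = np.asarray(dist, dtype=float)
    u_ref, _, vt_ref = np.linalg.svd(ref, full_matrices=False)
    u_dist, _, vt_dist = np.linalg.svd(dist, full_matrices=False)
    # Singular vectors are defined only up to sign, so use the absolute dot product.
    u_align = np.abs(np.sum(u_ref[:, :k] * u_dist[:, :k], axis=0))
    v_align = np.abs(np.sum(vt_ref[:k] * vt_dist[:k], axis=1))
    return np.concatenate([u_align, v_align])
```

Each feature lies in [0, 1] because the singular vectors have unit norm; values near 1 indicate that the corresponding structural component of the image is preserved.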


Subjects
Artificial Intelligence , Image Processing, Computer-Assisted , Pattern Recognition, Automated/methods , Humans , Information Storage and Retrieval , Reproducibility of Results , Visual Pathways/physiology