Results 1 - 20 of 44
1.
Article in English | MEDLINE | ID: mdl-38959150

ABSTRACT

Despite acceleration in the use of 3D meshes, it is difficult to find effective mesh quality assessment algorithms that can produce predictions highly correlated with human subjective opinions. Defining mesh quality features is challenging due to the irregular topology of meshes, which are defined on vertices and triangles. To address this, we propose a novel 3D projective structural similarity index (3D-PSSIM) for meshes that is robust to differences in mesh topology. We address topological differences between meshes by introducing multi-view and multi-layer projections that can densely represent the mesh textures and geometrical shapes irrespective of mesh topology, and that also address occlusion problems that occur during projection. We propose visual sensitivity weights that capture the perceptual sensitivity to the degree of mesh surface curvature. 3D-PSSIM computes perceptual quality predictions by aggregating quality-aware features that are computed in multiple projective spaces onto the mesh domain, rather than on 2D spaces. This allows 3D-PSSIM to determine which parts of a mesh surface are distorted by geometric or color impairments. Experimental results show that 3D-PSSIM can predict mesh quality with high correlation against human subjective judgments, even in the presence of noise and when there are large topological differences, outperforming existing mesh quality assessment models.

2.
IEEE Trans Image Process ; 30: 559-571, 2021.
Article in English | MEDLINE | ID: mdl-33206603

ABSTRACT

Although it is well known that the negative effects of VR sickness and the desirable sense of presence are important determinants of a user's immersive VR experience, there remains a lack of definitive research outcomes to enable the creation of methods to predict and/or optimize the trade-offs between them. Most VR sickness assessment (VRSA) and VR presence assessment (VRPA) studies reported to date have utilized simple image patterns as probes, hence their results are difficult to apply to the highly diverse contents encountered in general, real-world VR environments. To help fill this void, we have constructed a large, dedicated VR sickness/presence (VR-SP) database, which contains 100 VR videos with associated human subjective ratings. Using this new resource, we developed a statistical model of spatio-temporal and rotational frame difference maps to predict VR sickness. We also designed an exceptional motion feature, which is expressed as the correlation between an instantaneous change feature and averaged temporal features. By adding further features (visual activity, content features) to capture the sense of presence, we use the new data resource to explore the relationship between VRSA and VRPA. We also show that the aggregate VR-SP model is able to predict VR sickness with an accuracy of 90% and VR presence with an accuracy of 75% on the new VR-SP dataset.
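In their simplest form, the spatio-temporal frame-difference maps mentioned in this abstract reduce to per-transition statistics of absolute differences between consecutive frames. The sketch below illustrates such a temporal feature; the function name and the choice of mean absolute difference are ours, not the paper's exact model:

```python
import numpy as np

def frame_difference_energy(frames):
    """Mean absolute luminance difference between consecutive frames.

    Returns one value per frame transition; large values indicate the
    rapid temporal change associated with exceptional motion.
    """
    frames = np.asarray(frames, dtype=np.float64)  # shape: (T, H, W)
    diffs = np.abs(np.diff(frames, axis=0))        # (T-1, H, W)
    return diffs.mean(axis=(1, 2))                 # (T-1,)
```

A rotational variant, as the paper suggests, would first remap each frame to compensate for head rotation before differencing.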

3.
Article in English | MEDLINE | ID: mdl-31995494

ABSTRACT

Most full-reference image quality assessment (FR-IQA) methods advanced to date have been holistically designed without regard to the type of distortion impairing the image. However, the perception of distortion depends nonlinearly on the distortion type. Here we propose a novel FR-IQA framework that dynamically generates receptive fields responsive to distortion type. Our proposed method, the dynamic receptive field generation based image quality assessor (DRF-IQA), separates the process of FR-IQA into two streams: 1) dynamic error representation and 2) visual sensitivity-based quality pooling. The first stream generates dynamic receptive fields on the input distorted image, implemented by a trained convolutional neural network (CNN); the generated receptive field profiles are then convolved with the distorted and reference images, and differenced to produce spatial error maps. The second stream generates a visual sensitivity map, which is used to weight the spatial error maps. The experimental results show that the proposed model achieves state-of-the-art prediction accuracy on various open IQA databases.

4.
Article in English | MEDLINE | ID: mdl-32324554

ABSTRACT

The topics of visual and audio quality assessment (QA) have been widely researched for decades, yet nearly all of this prior work has focused only on single-mode visual or audio signals. However, visual signals are rarely presented without accompanying audio, including in high-bandwidth video streaming applications. Moreover, the distortions that may separately (or conjointly) afflict the visual and audio signals collectively shape the user-perceived quality of experience (QoE). This motivated us to conduct a subjective study of audio and video (A/V) quality, which we then used to compare and develop A/V quality measurement models and algorithms. The new LIVE-SJTU Audio and Video Quality Assessment (A/V-QA) Database includes 336 A/V sequences generated from 14 original source contents by applying 24 different A/V distortion combinations to them. We then conducted a subjective A/V quality perception study on the database towards attaining a better understanding of how humans perceive the overall combined quality of A/V signals. We also designed four different families of objective A/V quality prediction models, using a multimodal fusion strategy. The types of A/V quality models differ both in the unimodal audio and video quality prediction models comprising the direct signal measurements and in the way that the two perceptual signal modes are combined. The objective models are built using both existing state-of-the-art audio and video quality prediction models and some new prediction models, as well as quality-predictive features delivered by a deep neural network. The methods of fusing audio and video quality predictions that are considered include simple product combinations as well as learned mappings. Using the new subjective A/V database as a tool, we validated and tested all of the objective A/V quality prediction models. We will make the database publicly available to facilitate further research.

5.
IEEE Trans Image Process ; 18(1): 90-105, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19095521

ABSTRACT

Multimedia communication has become one of the main applications in commercial wireless systems. Multimedia sources, mainly consisting of digital images and videos, have high bandwidth requirements. Since bandwidth is a valuable resource, it is important that its use be optimized for image and video communication. Therefore, interest in developing new joint source-channel coding (JSCC) methods for image and video communication is increasing. Design of any JSCC scheme requires an estimate of the distortion at different source coding rates and under different channel conditions. The common approach to obtain this estimate is via simulations or operational rate-distortion curves. These approaches, however, are computationally intensive and, hence, not feasible for real-time coding and transmission applications. A more feasible approach to estimate distortion is to develop models that predict distortion at different source coding rates and under different channel conditions. Based on this idea, we present a distortion model for estimating the distortion due to quantization and channel errors in MPEG-4 compressed video streams at different source coding rates and channel bit error rates. This model takes into account important aspects of video compression such as transform coding, motion compensation, and variable length coding. Results show that our model estimates distortion within 1.5 dB of actual simulation values in terms of peak signal-to-noise ratio (PSNR).


Subject(s)
Algorithms , Data Compression/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Signal Processing, Computer-Assisted , Video Recording/methods , Computer Simulation , Models, Statistical , Reproducibility of Results , Sensitivity and Specificity
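The abstract above reports model accuracy in PSNR terms. As a point of reference, PSNR follows directly from the mean squared error; a minimal NumPy sketch (the function name and 8-bit peak default are our own choices):

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two same-sized images."""
    reference = np.asarray(reference, dtype=np.float64)
    distorted = np.asarray(distorted, dtype=np.float64)
    mse = np.mean((reference - distorted) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```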
6.
IEEE Trans Image Process ; 17(9): 1624-39, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18701399

ABSTRACT

In this paper, we derive bounds on the structural similarity (SSIM) index as a function of quantization rate for fixed-rate uniform quantization of image discrete cosine transform (DCT) coefficients under the high-rate assumption. The space domain SSIM index is first expressed in terms of the DCT coefficients of the space domain vectors. The transform domain SSIM index is then used to derive bounds on the average SSIM index as a function of quantization rate for uniform, Gaussian, and Laplacian sources. As an illustrative example, uniform quantization of the DCT coefficients of natural images is considered. We show that the SSIM index between the reference and quantized images falls within the bounds for a large set of natural images. Further, we show using a simple example that the proposed bounds could be very useful for rate allocation problems in practical image and video coding applications.


Subject(s)
Algorithms , Data Compression/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Signal Processing, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity
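The SSIM index discussed in the abstract above combines luminance, contrast, and structure comparisons. The sketch below computes a single-window SSIM over the whole image with the standard stabilizing constants; it omits the local sliding Gaussian window of the full SSIM definition, so it is an illustration rather than the paper's exact measure:

```python
import numpy as np

def ssim_global(x, y, peak=255.0):
    """Single-window SSIM: luminance, contrast, and structure terms
    computed over the whole image (no sliding Gaussian window)."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images score 1; a constant luminance shift lowers only the luminance term, leaving contrast and structure intact.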
7.
IEEE Trans Image Process ; 17(9): 1672-84, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18713673

ABSTRACT

Exploiting the quasi-linear relationship between local phase and disparity, phase-differencing registration algorithms provide a fast, powerful means for disparity estimation. Unfortunately, these phase-differencing techniques suffer a significant impediment: phase nonlinearities. In regions of phase nonlinearity, the signals under consideration possess properties that invalidate the use of phase for disparity estimation. This paper uses the amenable properties of Gaussian white noise images to analytically quantify these properties. The improved understanding gained from this analysis enables us to better understand current methodologies for detecting regions of phase instability. Most importantly, we introduce a new, more effective means for identifying these regions based on the second derivative of phase.


Subject(s)
Algorithms , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Photogrammetry/methods , Microscopy, Phase-Contrast/methods , Nonlinear Dynamics , Reproducibility of Results , Sensitivity and Specificity
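Under the quasi-linear phase model described in the abstract above, disparity can be recovered by dividing the wrapped phase difference of two bandpass responses by the channel frequency. A minimal 1-D sketch, in which projection onto a single complex exponential stands in for a Gabor filter response (function and parameter names are ours):

```python
import numpy as np

def phase_disparity(left, right, freq):
    """Disparity estimate from the phase difference of two 1-D signals,
    measured by projecting each onto a complex exponential at spatial
    frequency `freq` (a stand-in for a bandpass Gabor response)."""
    n = np.arange(len(left))
    carrier = np.exp(-2j * np.pi * freq * n)
    phase_left = np.angle(np.sum(left * carrier))
    phase_right = np.angle(np.sum(right * carrier))
    # Wrap the difference to (-pi, pi] before dividing by the frequency.
    dphi = np.angle(np.exp(1j * (phase_left - phase_right)))
    return dphi / (2.0 * np.pi * freq)
```

The phase wrap limits the measurable disparity to half a wavelength of the channel, which is one reason phase nonlinearities and instability regions matter in practice.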
8.
IEEE Trans Image Process ; 17(6): 857-72, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18482882

ABSTRACT

We propose an algorithm for designing linear equalizers that maximize the structural similarity (SSIM) index between the reference and restored signals. The SSIM index has enjoyed considerable application in the evaluation of image processing algorithms, yet algorithms have not previously been designed to explicitly optimize for this measure. The design of such an algorithm is nontrivial due to the nonconvex nature of the distortion measure. In this paper, we reformulate the nonconvex problem as a quasi-convex optimization problem, which admits a tractable solution. We compute the optimal solution in near closed form, with the complexity of the resulting algorithm comparable to that of the linear minimum mean squared error (MMSE) solution, independent of the number of filter taps. To demonstrate the usefulness of the proposed algorithm, it is applied to restore images that have been blurred and corrupted with additive white Gaussian noise. As a special case, we consider blur-free image denoising. In each case, its performance is compared to that of a locally adaptive linear MSE-optimal filter. We show that the images denoised and restored using the SSIM-optimal filter have higher SSIM indices and superior perceptual quality compared with those restored using the MSE-optimal adaptive linear filter. Through these results, we demonstrate that a) designing image processing algorithms, and in particular denoising and restoration-type algorithms, to optimize perceptual distortion measures can yield significant gains over existing (in particular, linear MMSE-based) algorithms, and b) these gains may be obtained without a significant increase in the computational complexity of the algorithm.


Subject(s)
Algorithms , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Subtraction Technique , Computer Simulation , Linear Models , Numerical Analysis, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity , Signal Processing, Computer-Assisted
9.
Article in English | MEDLINE | ID: mdl-30222561

ABSTRACT

The great variation in videographic skill, camera designs, compression and processing protocols, communication and bandwidth environments, and displays leads to an enormous variety of video impairments. Current no-reference (NR) video quality models are unable to handle this diversity of distortions. This is true in part because available video quality assessment databases contain very limited content at fixed resolutions, were captured using a small number of camera devices by a few videographers, and have been subjected to a modest number of distortions. As such, these databases fail to adequately represent real-world videos, which contain very different kinds of content obtained under highly diverse imaging conditions and are subject to authentic, complex, and often commingled distortions that are difficult or impossible to simulate. As a result, NR video quality predictors tested on real-world video data often perform poorly. Towards advancing NR video quality prediction, we have constructed a large-scale video quality assessment database containing 585 videos of unique content, captured by a large number of users, with wide ranges of levels of complex, authentic distortions. We collected a large number of subjective video quality scores via crowdsourcing. A total of 4776 unique participants took part in the study, yielding more than 205,000 opinion scores, an average of 240 recorded human opinions per video. We demonstrate the value of the new resource, which we call the LIVE Video Quality Challenge Database (LIVE-VQC for short), by conducting a comparison of leading NR video quality predictors on it. This study is the largest video quality assessment study ever conducted along several key dimensions: number of unique contents, capture devices, distortion types and combinations of distortions, study participants, and recorded subjective scores.
The database is available for download on this link: http://live.ece.utexas.edu/research/LIVEVQC/index.html.

10.
Article in English | MEDLINE | ID: mdl-29994709

ABSTRACT

Most prior approaches to the problem of stereoscopic 3D (S3D) visual discomfort prediction (VDP) have focused on the extraction of perceptually meaningful handcrafted features based on models of visual perception and of natural depth statistics. Towards advancing performance on this problem, we have developed a deep learning based VDP model named the Deep Visual Discomfort Predictor (DeepVDP). DeepVDP uses a convolutional neural network (CNN) to learn features that are highly predictive of experienced visual discomfort. Since a large amount of reference data is needed to train a CNN, we develop a systematic way of dividing an S3D image into local regions defined as patches, and model a patch-based CNN using two sequential training steps. Since it is very difficult to obtain human opinions on each patch, a proxy ground-truth label generated by an existing S3D visual discomfort prediction algorithm called 3D-VDP is instead assigned to each patch. These proxy ground-truth labels are used to conduct the first stage of training the CNN. In the second stage, the automatically learned local abstractions are aggregated into global features via a feature aggregation layer. The learned features are iteratively updated via supervised learning on subjective 3D discomfort scores, which serve as ground-truth labels on each S3D image. The patch-based CNN model that has been pretrained on proxy ground-truth labels is subsequently retrained on true global subjective scores. The global S3D visual discomfort scores predicted by the trained DeepVDP model achieve state-of-the-art performance compared to previous VDP algorithms.

11.
IEEE Trans Image Process ; 16(3): 813-23, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17357739

ABSTRACT

We cast the problem of corner detection as a corner search process. We develop principles of foveated visual search and automated fixation selection to accomplish the corner search, supplying a case study of both foveated search and foveated feature detection. The result is a new algorithm for finding corners, which is also a corner-based algorithm for aiming computed foveated visual fixations. In the algorithm, long saccades move the fovea to previously unexplored areas of the image, while short saccades improve the accuracy of putative corner locations. The system is tested on two natural scenes. As an interesting comparison study, we compare the fixations generated by the algorithm with those of subjects viewing the same images, whose eye movements were recorded by an eye tracker. The comparison of fixation patterns is made using an information-theoretic measure. Results show that the algorithm is a good locator of corners, but does not correlate particularly well with human visual fixations.


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Reproducibility of Results , Sensitivity and Specificity
12.
IEEE Trans Image Process ; 26(10): 4885-4899, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28641258

ABSTRACT

Crosstalk is one of the most severe factors affecting the perceived quality of stereoscopic 3D images. It arises from a leakage of light intensity between multiple views, as in auto-stereoscopic displays. Well-known determinants of crosstalk include the co-location contrast and disparity of the left and right images, which have been dealt with in prior studies. However, when a natural stereo image that contains complex naturalistic spatial characteristics is viewed on an auto-stereoscopic display, other factors may also play an important role in the perception of crosstalk. Here, we describe a new way of predicting the perceived severity of crosstalk, which we call the Binocular Perceptual Crosstalk Predictor (BPCP). BPCP uses measurements of three complementary 3D image properties (texture, structural duplication, and binocular summation) in combination with two well-known factors (co-location contrast and disparity) to make predictions of crosstalk on two-view auto-stereoscopic displays. The new BPCP model includes two masking algorithms and a binocular pooling method. We explore a new masking phenomenon that we call duplicated structure masking, which arises from structural correlations between the original and distorted objects. We also utilize an advanced binocular summation model to develop a binocular pooling algorithm. Our experimental results indicate that BPCP achieves high correlations against subjective test results, improving upon those delivered by previous crosstalk prediction models.

13.
IEEE Trans Image Process ; 26(7): 3479-3491, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28436873

ABSTRACT

The capability to automatically evaluate the quality of long wave infrared (LWIR) and visible light images has the potential to play an important role in determining and controlling the quality of a resulting fused LWIR-visible light image. Extensive work has been conducted on studying the statistics of natural LWIR and visible images. Nonetheless, little work has been done on analyzing the statistics of fused LWIR and visible images and associated distortions. In this paper, we analyze five multi-resolution-based image fusion methods with regard to several common distortions, including blur, white noise, JPEG compression, and non-uniformity. We study the natural scene statistics of fused images and how they are affected by these kinds of distortions. Furthermore, we conducted a human study on the subjective quality of pristine and degraded fused LWIR-visible images, in which 27 subjects each evaluated 750 images over five sessions. We used this new database to create an automatic opinion-distortion-unaware fused image quality model and analyzer algorithm. We also propose an opinion-aware fused image quality analyzer, whose predictions correlate better with human perceptual evaluations than those of other state-of-the-art models. An implementation of the proposed fused image quality measures can be found at https://github.com/ujemd/NSS-of-LWIR-and-Vissible-Images. Also, the new database can be found at http://bit.ly/2noZlbQ.

14.
IEEE Trans Image Process ; 26(8): 3789-3801, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28499997

ABSTRACT

Conventional stereoscopic 3D (S3D) displays do not provide accommodation depth cues of the 3D image or video contents being viewed. The sense of content depths is thus limited to cues supplied by motion parallax (for 3D video), stereoscopic vergence cues created by presenting left and right views to the respective eyes, and other contextual and perspective depth cues. The absence of accommodation cues can induce two kinds of accommodation vergence mismatches (AVM) at the fixation and peripheral points, which can result in severe visual discomfort. With the aim of alleviating discomfort arising from AVM, we propose a new visual comfort enhancement approach for processing S3D visual signals to deliver a more comfortable 3D viewing experience at the display. This is accomplished via an optimization process whereby a predictive indicator of visual discomfort is minimized, while still aiming to maintain the viewer's sense of 3D presence by performing a suitable parallax shift, and by directed blurring of the signal. Our processing framework is defined on 3D visual coordinates that reflect the nonuniform resolution of retinal sensors and that uses a measure of 3D saliency strength. An appropriate level of blur that corresponds to the degree of parallax shift is found, making it possible to produce synthetic accommodation cues implemented using a perceptively relevant filter. By this method, AVM, the primary contributor to the discomfort felt when viewing S3D images, is reduced. We show via a series of subjective experiments that the proposed approach improves visual comfort while preserving the sense of 3D presence.

15.
IEEE Trans Image Process ; 26(11): 5217-5231, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28742036

ABSTRACT

HTTP adaptive streaming is being increasingly deployed by network content providers, such as Netflix and YouTube. By dividing video content into data chunks encoded at different bitrates, a client is able to request the appropriate bitrate for the segment to be played next based on the estimated network conditions. However, this can introduce a number of impairments, including compression artifacts and rebuffering events, which can severely impact an end-user's quality of experience (QoE). We have recently created a new video quality database, which simulates a typical video streaming application, using long video sequences and interesting Netflix content. Going beyond previous efforts, the new database contains highly diverse and contemporary content, and it includes the subjective opinions of a sizable number of human subjects regarding the effects on QoE of both rebuffering and compression distortions. We observed that rebuffering is always obvious and unpleasant to subjects, while bitrate changes may be less obvious due to content-related dependencies. Transient bitrate drops were preferable over rebuffering only on low complexity video content, while consistently low bitrates were poorly tolerated. We evaluated different objective video quality assessment algorithms on our database and found that objective video quality models are unreliable for QoE prediction on videos suffering from both rebuffering events and bitrate changes. This implies the need for more general QoE models that take into account objective quality models, rebuffering-aware information, and memory. The publicly available video content as well as metadata for all of the videos in the new database can be found at http://live.ece.utexas.edu/research/LIVE_NFLXStudy/nflx_index.html.

16.
IEEE Trans Image Process ; 15(11): 3440-51, 2006 Nov.
Article in English | MEDLINE | ID: mdl-17076403

ABSTRACT

Measurement of visual quality is of fundamental importance for numerous image and video processing applications, where the goal of quality assessment (QA) algorithms is to automatically assess the quality of images or videos in agreement with human quality judgments. Over the years, many researchers have taken different approaches to the problem, contributing significant research in this area and claiming progress in their respective domains. It is important to evaluate the performance of these algorithms in a comparative setting and to analyze their strengths and weaknesses. In this paper, we present the results of an extensive subjective quality assessment study in which a total of 779 distorted images were evaluated by about two dozen human subjects. The "ground truth" image quality data obtained from about 25,000 individual human quality judgments is used to evaluate the performance of several prominent full-reference image quality assessment algorithms. To the best of our knowledge, apart from the video quality studies conducted by the Video Quality Experts Group, the study presented in this paper is the largest subjective image quality study in the literature in terms of number of images, distortion types, and number of human judgments per image. Moreover, we have made the data from the study freely available to the research community, allowing other researchers to easily report comparative results in the future.


Subject(s)
Algorithms , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Software Validation , Software , Data Interpretation, Statistical , Information Storage and Retrieval/methods , Quality Control
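Evaluating objective quality algorithms against subjective "ground truth" scores, as in the study above, is conventionally summarized with rank correlations. A tie-free Spearman rank-order correlation (SROCC) can be sketched in a few lines; ties would require averaged ranks, which this illustration omits:

```python
import numpy as np

def srocc(objective, subjective):
    """Spearman rank-order correlation between predicted and subjective
    scores, assuming no tied values (ties would need averaged ranks)."""
    def ranks(values):
        order = np.argsort(values)
        r = np.empty(len(values))
        r[order] = np.arange(1, len(values) + 1)
        return r
    rx = ranks(np.asarray(objective, dtype=np.float64))
    ry = ranks(np.asarray(subjective, dtype=np.float64))
    return float(np.corrcoef(rx, ry)[0, 1])
```

Because SROCC depends only on rank order, it rewards monotonic agreement with human judgments without requiring a fitted nonlinear mapping between objective and subjective scales.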
17.
IEEE Trans Image Process ; 15(6): 1680-9, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16764291

ABSTRACT

We propose the concept of quality-aware image, in which certain extracted features of the original (high-quality) image are embedded into the image data as invisible hidden messages. When a distorted version of such an image is received, users can decode the hidden messages and use them to provide an objective measure of the quality of the distorted image. To demonstrate the idea, we build a practical quality-aware image encoding, decoding and quality analysis system, which employs: 1) a novel reduced-reference image quality assessment algorithm based on a statistical model of natural images and 2) a previously developed quantization watermarking-based data hiding technique in the wavelet transform domain.


Subject(s)
Algorithms , Data Compression/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Signal Processing, Computer-Assisted , Quality Control
18.
IEEE Trans Image Process ; 25(2): 615-29, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26672036

ABSTRACT

The human visual system perceives 3D depth following sensing via its binocular optical system, a series of massively parallel processing units, and a feedback system that controls the mechanical dynamics of eye movements and the crystalline lens. The process of accommodation (focusing of the crystalline lens) and binocular vergence is controlled simultaneously and symbiotically via cross-coupled communication between the two critical depth computation modalities. The output responses of these two subsystems, which are induced by oculomotor control, are used in the computation of a clear and stable cyclopean 3D image from the input stimuli. These subsystems operate in smooth synchronicity when one is viewing the natural world; however, conflicting responses can occur when viewing stereoscopic 3D (S3D) content on fixed displays, causing physiological discomfort. If such occurrences could be predicted, then they might also be avoided (by modifying the acquisition process) or ameliorated (by changing the relative scene depth). Toward this end, we have developed a dynamic accommodation and vergence interaction (DAVI) model that successfully predicts visual discomfort on S3D images. The DAVI model is based on the phasic and reflex responses of the fast fusional vergence mechanism. Quantitative models of accommodation and vergence mismatches are used to conduct visual discomfort prediction. Other 3D perceptual elements are included in the proposed method, including sharpness limits imposed by the depth of focus and fusion limits implied by Panum's fusional area. The DAVI predictor is created by training a support vector machine on features derived from the proposed model and on recorded subjective assessment results. The experimental results are shown to produce accurate predictions of experienced visual discomfort.


Subject(s)
Depth Perception/physiology , Eye Movements/physiology , Imaging, Three-Dimensional/methods , Adult , Humans , Models, Biological , Models, Statistical , Young Adult
19.
IEEE Trans Image Process ; 25(1): 65-79, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26540687

ABSTRACT

Natural scene statistics (NSS) provide powerful, perceptually relevant tools that have been successfully used for the quality analysis of visible light images. Since NSS capture statistical regularities that arise from the physical world, they are also relevant to long wave infrared (LWIR) images, which differ from visible light images mainly in the wavelengths captured at the imaging sensors. We show that NSS models of bandpass LWIR images are similar to those of visible light images, but with different parameterizations. Using this difference, we exploit the power of NSS to successfully distinguish between LWIR images and visible light images. In addition, we study distortions unique to LWIR and find directional models useful for detecting the halo effect, simple bandpass models useful for detecting hotspots, and combinations of these models useful for measuring the degree of non-uniformity present in many LWIR images. For local distortion identification and measurement, we also describe a method for generating distortion maps using NSS features. To facilitate our evaluation, we analyze the NSS of LWIR images under pristine and distorted conditions, using four databases, each captured with a different IR camera. Predicting human performance in assessing distortion and quality in LWIR images is critical for task efficacy; we find that NSS features improve the prediction of human targeting task performance. Furthermore, we conducted a human study on the perceptual quality of noise- and blur-distorted LWIR images and created a new blind image quality predictor for IR images.


Subject(s)
Image Processing, Computer-Assisted/methods , Infrared Rays , Models, Statistical , Thermography/methods , Algorithms , Humans
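The bandpass NSS models discussed in the abstract above are typically built on mean-subtracted, contrast-normalized (MSCN) coefficients. The sketch below computes an MSCN map with a square box window in place of the Gaussian weighting used in the NSS literature, purely as a dependency-free illustration (names and defaults are ours):

```python
import numpy as np

def mscn_coefficients(img, k=7, eps=1.0):
    """Mean-subtracted, contrast-normalized (MSCN) coefficients using a
    k x k box window (the NSS literature uses Gaussian weighting; a box
    window keeps this sketch dependency-free)."""
    img = np.asarray(img, dtype=np.float64)
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    # Local mean and standard deviation over every k x k neighborhood.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    mu = windows.mean(axis=(-2, -1))
    sigma = windows.std(axis=(-2, -1))
    return (img - mu) / (sigma + eps)
```

For natural visible-light images the histogram of these coefficients is close to a generalized Gaussian; deviations from that shape are what NSS-based distortion detectors exploit.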
20.
IEEE Trans Image Process ; 14(1): 23-35, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15646870

ABSTRACT

We develop theorems that place limits on the point-wise approximation of the responses of filters, both linear shift invariant (LSI) and linear shift variant (LSV), to input signals and images that are LSV in the following sense: they can be expressed as the outputs of systems with LSV impulse responses, where the shift variance is with respect to the filter scale of a single-prototype filter. The approximations take the form of LSI approximations to the responses. We develop tight bounds on the approximation errors expressed in terms of filter durations and derivative (Sobolev) norms. Finally, we apply the developed theory to the defoveation of images, the deblurring of shift-variant blurs, and shift-variant edge detection.


Subject(s)
Algorithms , Artificial Intelligence , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Subtraction Technique , Computer Graphics , Computer Simulation , Information Storage and Retrieval/methods , Models, Statistical , Numerical Analysis, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity