Results 1 - 20 of 40

1.
J Vis ; 13(4), 2013 Mar 11.
Article in English | MEDLINE | ID: mdl-23479475

ABSTRACT

The human visual system possesses the remarkable ability to pick out salient objects in images. Even more impressive is its ability to do the very same in the presence of disturbances. In particular, the ability persists despite the presence of noise, poor weather, and other impediments to perfect vision. Meanwhile, noise can significantly degrade the accuracy of automated computational saliency detection algorithms. In this article, we set out to remedy this shortcoming. Existing computational saliency models generally assume that the given image is clean, and a fundamental and explicit treatment of saliency in noisy images is missing from the literature. Here we propose a novel and statistically sound method for estimating saliency based on a nonparametric regression framework, investigate the stability of saliency models for noisy images, and analyze how state-of-the-art computational models respond to noisy visual stimuli. The proposed model of saliency at a pixel of interest is a data-dependent weighted average of dissimilarities between a center patch around that pixel and other patches. To further improve accuracy in predicting human fixations and stability to noise, we incorporate a global and multiscale approach by extending the local analysis window to the entire input image, and even further to multiple scaled copies of the image. Our method consistently outperforms six other state-of-the-art models (Bruce & Tsotsos, 2009; Garcia-Diaz, Fdez-Vidal, Pardo, & Dosil, 2012; Goferman, Zelnik-Manor, & Tal, 2010; Hou & Zhang, 2007; Seo & Milanfar, 2009; Zhang, Tong, & Marks, 2008) for both noise-free and noisy cases.
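
To make the core mechanism concrete, here is a toy, single-scale version of patch-dissimilarity saliency in plain NumPy. It is an illustrative sketch rather than the authors' implementation: the grayscale input in [0, 1], the patch and search-window sizes, and the bandwidth h are all assumptions, and the global/multiscale extension is omitted.

```python
import numpy as np

def patch_saliency(img, patch=7, search=21, h=0.2):
    """Toy saliency: the score at each pixel is a data-weighted average of
    dissimilarities between the patch centered there and the other patches
    in a local search window (weights decay with dissimilarity)."""
    r, s = patch // 2, search // 2
    pad = np.pad(img, r + s, mode="reflect")
    H, W = img.shape
    sal = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            ci, cj = i + r + s, j + r + s
            center = pad[ci - r:ci + r + 1, cj - r:cj + r + 1]
            dists, weights = [], []
            for di in range(-s, s + 1):
                for dj in range(-s, s + 1):
                    if di == 0 and dj == 0:
                        continue
                    other = pad[ci + di - r:ci + di + r + 1,
                                cj + dj - r:cj + dj + r + 1]
                    d = np.mean((center - other) ** 2)
                    dists.append(d)
                    weights.append(np.exp(-d / h ** 2))  # data-dependent weight
            sal[i, j] = np.dot(weights, dists) / np.sum(weights)
    return sal

# Example: sal = patch_saliency(np.random.rand(48, 48))
```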


Subject(s)
Biological Models, Perceptual Masking/physiology, Reaction Time/physiology, Visual Perception/physiology, Ocular Fixation/physiology, Humans, Regression Analysis, Sensory Thresholds/physiology
2.
Article in English | MEDLINE | ID: mdl-37030810

ABSTRACT

Video watermarking embeds a message into a cover video in an imperceptible manner, which can be retrieved even if the video undergoes certain modifications or distortions. Traditional watermarking methods are often manually designed for particular types of distortions and thus cannot simultaneously handle a broad spectrum of distortions. To this end, we propose a robust deep learning-based solution for video watermarking that is end-to-end trainable. Our model consists of a novel multiscale design where the watermarks are distributed across multiple spatial-temporal scales. Extensive evaluations on a wide variety of distortions show that our method outperforms traditional video watermarking methods as well as deep image watermarking models by a large margin. We further demonstrate the practicality of our method on a realistic video-editing application.

3.
IEEE Trans Image Process ; 30: 5944-5955, 2021.
Article in English | MEDLINE | ID: mdl-34166193

ABSTRACT

This work considers noise removal from images, focusing on the well-known K-SVD denoising algorithm. This sparsity-based method was proposed in 2006, and for a short while it was considered state-of-the-art. However, over the years it has been surpassed by other methods, including the recent deep-learning-based newcomers. The question we address in this paper is whether K-SVD was brought to its peak in its original conception, or whether it can be made competitive again. The approach we take in answering this question is to redesign the algorithm to operate in a supervised manner. More specifically, we propose an end-to-end deep architecture with the exact K-SVD computational path, and train it for optimized denoising. Our work shows how to overcome difficulties arising in turning the K-SVD scheme into a differentiable, and thus learnable, machine. With a small number of parameters to learn and while preserving the original K-SVD essence, the proposed architecture is shown to outperform the classical K-SVD algorithm substantially and to approach recent state-of-the-art learning-based denoising methods. In a broader context, this work touches on themes around the design of deep-learning solutions for image processing tasks, while building a bridge between classic methods and novel deep-learning-based ones.
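
As a rough illustration of what a differentiable K-SVD-style pipeline involves, the sketch below sparse-codes noisy patches over a fixed dictionary with ISTA (a smooth, unrolling-friendly stand-in for the pursuit step) and reconstructs them. The dictionary, threshold, iteration count, and the choice of ISTA itself are assumptions made for illustration, not the authors' architecture.

```python
import numpy as np

def soft(x, t):
    """Soft thresholding: the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista_sparse_code(patches, D, lam=0.1, n_iter=50):
    """Sparse-code each patch (a column of `patches`) over dictionary D.
    Every step is differentiable, so in an unrolled, supervised version the
    threshold lam (and even D) could be made trainable."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    A = np.zeros((D.shape[1], patches.shape[1]))
    for _ in range(n_iter):
        A = soft(A - D.T @ (D @ A - patches) / L, lam / L)
    return A

# Toy usage: reconstruct patches from their sparse codes.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)               # unit-norm atoms
noisy_patches = rng.standard_normal((64, 200))
denoised_patches = D @ ista_sparse_code(noisy_patches, D)
```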

4.
Annu Rev Vis Sci ; 7: 571-604, 2021 09 15.
Article in English | MEDLINE | ID: mdl-34524880

ABSTRACT

The first mobile camera phone was sold only 20 years ago, when taking pictures with one's phone was an oddity, and sharing pictures online was unheard of. Today, the smartphone is more camera than phone. How did this happen? This transformation was enabled by advances in computational photography, the science and engineering of making great images from small-form-factor mobile cameras. Modern algorithmic and computing advances, including machine learning, have changed the rules of photography, bringing to it new modes of capture, postprocessing, storage, and sharing. In this review, we give a brief history of mobile computational photography and describe some of the key technological components, including burst photography, noise reduction, and super-resolution. At each step, we can draw naive parallels to the human visual system.


Subject(s)
Cell Phone, Photography, Humans, Smartphone
5.
IEEE Trans Image Process ; 30: 6673-6685, 2021.
Article in English | MEDLINE | ID: mdl-34264828

ABSTRACT

Could we compress images via standard codecs while avoiding visible artifacts? The answer is obvious - this is doable as long as the bit budget is generous enough. What if the allocated bit-rate for compression is insufficient? Then unfortunately, artifacts are a fact of life. Many attempts have been made over the years to fight this phenomenon, with varying degrees of success. In this work we aim to break the unholy connection between bit-rate and image quality, and propose a way to circumvent compression artifacts by pre-editing the incoming image and modifying its content to fit the given bits. We design this editing operation as a learned convolutional neural network, and formulate an optimization problem for its training. Our loss takes into account the proximity between the original image and the edited one, a bit-budget penalty on the proposed image, and a no-reference image quality measure that forces the outcome to be visually pleasing. The proposed approach is demonstrated on the popular JPEG compression, showing savings in bits and/or improvements in visual quality, obtained with intricate editing effects.
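
One plausible way to write the training objective sketched above (our notation; the weights, the quality measure Q, and the differentiable bit-rate proxy B are assumptions, not the paper's exact loss):

\[
\mathcal{L}(\theta) = \big\| x - E_\theta(x) \big\|_2^2
+ \lambda_b\, B\big(\mathrm{JPEG}_q(E_\theta(x))\big)
+ \lambda_q\, Q\big(\mathrm{JPEG}_q(E_\theta(x))\big),
\]

where E_\theta is the learned editing network, the first term keeps the edited image close to the original, B penalizes the coded bit-rate at quality factor q, and Q is a no-reference quality score (oriented so that lower is better).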

6.
IEEE Trans Image Process ; 18(7): 1438-51, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19447711

ABSTRACT

In this paper, we propose K-LLD: a patch-based, locally adaptive denoising method based on clustering the given noisy image into regions of similar geometric structure. In order to effectively perform such clustering, we employ as features the local weight functions derived from our earlier work on steering kernel regression. These weights are exceedingly informative and robust in conveying reliable local structural information about the image even in the presence of significant amounts of noise. Next, we model each region (or cluster), which may not be spatially contiguous, by "learning" a best basis describing the patches within that cluster using principal components analysis. This learned basis (or "dictionary") is then employed to optimally estimate the underlying pixel values using a kernel regression framework. An iterated version of the proposed algorithm is also presented which leads to further performance enhancements. We also introduce a novel mechanism for optimally choosing the local patch size for each cluster using Stein's unbiased risk estimator (SURE). We illustrate the overall algorithm's capabilities with several examples. These indicate that the proposed method appears to be competitive with some of the most recently published state-of-the-art denoising methods.
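
A heavily simplified sketch of the cluster-then-learn-a-basis idea follows. It is not K-LLD itself: raw patch intensities stand in for the steering-kernel weight features, plain k-means replaces the paper's clustering, and the patch size is fixed rather than chosen by SURE; all of these are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def cluster_pca_denoise(patches, n_clusters=8, n_components=8):
    """patches: (N, d) float array of vectorized noisy patches.
    1) Group patches into structurally similar clusters.
    2) Learn a small PCA basis ("dictionary") per cluster.
    3) Reconstruct each patch from its cluster's leading components."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(patches)
    out = np.empty_like(patches)
    for k in range(n_clusters):
        idx = np.where(labels == k)[0]
        if idx.size == 0:
            continue
        ncomp = min(n_components, idx.size, patches.shape[1])
        pca = PCA(n_components=ncomp).fit(patches[idx])
        out[idx] = pca.inverse_transform(pca.transform(patches[idx]))
    return out
```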

7.
IEEE Trans Image Process ; 18(1): 36-51, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19095517

ABSTRACT

Super-resolution reconstruction proposes a fusion of several low-quality images into one higher quality result with better optical resolution. Classic super-resolution techniques strongly rely on the availability of accurate motion estimation for this fusion task. When the motion is estimated inaccurately, as often happens for nonglobal motion fields, annoying artifacts appear in the super-resolved outcome. Encouraged by recent developments on the video denoising problem, where state-of-the-art algorithms are formed with no explicit motion estimation, we seek a super-resolution algorithm of similar nature that will allow processing sequences with general motion patterns. In this paper, we base our solution on the Nonlocal-Means (NLM) algorithm. We show how this denoising method is generalized to become a relatively simple super-resolution algorithm with no explicit motion estimation. Results on several test movies show that the proposed method is very successful in providing super-resolution on general sequences.
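
To show what "fusion without motion estimation" looks like in its simplest form, the toy below merges a stack of frames using NLM-style patch-similarity weights; no motion vectors are computed. It is a sketch only: it stays on the input grid (the resolution-increasing step is omitted), uses frame 0 as the reference, and the patch size, search radius, and bandwidth are assumptions.

```python
import numpy as np

def nlm_fuse_frames(frames, patch=5, search=7, h=0.1):
    """frames: (T, H, W) float stack. Each output pixel is a weighted average
    over pixels from all frames, weighted by how similar their surrounding
    patches are to the reference patch; similarity replaces motion vectors."""
    T, H, W = frames.shape
    r, s = patch // 2, search // 2
    pad = np.pad(frames, ((0, 0), (r + s, r + s), (r + s, r + s)), mode="reflect")
    ref = pad[0]
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            ci, cj = i + r + s, j + r + s
            center = ref[ci - r:ci + r + 1, cj - r:cj + r + 1]
            num, den = 0.0, 0.0
            for t in range(T):
                for di in range(-s, s + 1):
                    for dj in range(-s, s + 1):
                        other = pad[t, ci + di - r:ci + di + r + 1,
                                       cj + dj - r:cj + dj + r + 1]
                        w = np.exp(-np.mean((center - other) ** 2) / h ** 2)
                        num += w * pad[t, ci + di, cj + dj]
                        den += w
            out[i, j] = num / den
    return out

# Example: fused = nlm_fuse_frames(np.random.rand(4, 32, 32))
```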


Subject(s)
Algorithms, Artificial Intelligence, Image Enhancement/methods, Computer-Assisted Image Interpretation/methods, Automated Pattern Recognition/methods, Reproducibility of Results, Sensitivity and Specificity
8.
IEEE Trans Image Process ; 18(9): 1958-75, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19473940

ABSTRACT

The need for precise (subpixel accuracy) motion estimates in conventional super-resolution has limited its applicability to only video sequences with relatively simple motions such as global translational or affine displacements. In this paper, we introduce a novel framework for adaptive enhancement and spatiotemporal upscaling of videos containing complex activities without explicit need for accurate motion estimation. Our approach is based on multidimensional kernel regression, where each pixel in the video sequence is approximated with a 3-D local (Taylor) series, capturing the essential local behavior of its spatiotemporal neighborhood. The coefficients of this series are estimated by solving a local weighted least-squares problem, where the weights are a function of the 3-D space-time orientation in the neighborhood. As this framework is fundamentally based upon the comparison of neighboring pixels in both space and time, it implicitly contains information about the local motion of the pixels across time, therefore rendering unnecessary an explicit computation of motions of modest size. The proposed approach not only significantly widens the applicability of super-resolution methods to a broad variety of video sequences containing complex motions, but also yields improved overall performance. Using several examples, we illustrate that the developed algorithm has super-resolution capabilities that provide improved optical resolution in the output, while being able to work on general input video with essentially arbitrary motion.
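
In equation form (our notation, first order shown for brevity), the local expansion and its weighted least-squares fit read

\[
\hat{\beta} = \arg\min_{\beta_0, \beta_1} \sum_{i \in \Omega(\mathbf{x})}
\Big[ y_i - \beta_0 - \beta_1^{\top}(\mathbf{x}_i - \mathbf{x}) \Big]^2
K_{\mathbf{H}}(\mathbf{x}_i - \mathbf{x}),
\qquad \hat{z}(\mathbf{x}) = \hat{\beta}_0,
\]

where \mathbf{x} = (x_1, x_2, t) is the space-time position of interest, \Omega(\mathbf{x}) its 3-D neighborhood, and K_{\mathbf{H}} a kernel whose data-adaptive smoothing matrix \mathbf{H} encodes the local space-time orientation. The symbols are ours; the paper's full formulation also covers higher orders and the specific orientation-adaptive choice of \mathbf{H}.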

9.
J Vis ; 9(12): 15.1-27, 2009 Nov 20.
Article in English | MEDLINE | ID: mdl-20053106

ABSTRACT

We present a novel unified framework for both static and space-time saliency detection. Our method is a bottom-up approach and computes so-called local regression kernels (i.e., local descriptors) from the given image (or a video), which measure the likeness of a pixel (or voxel) to its surroundings. Visual saliency is then computed using the said "self-resemblance" measure. The framework results in a saliency map where each pixel (or voxel) indicates the statistical likelihood of saliency of a feature matrix given its surrounding feature matrices. As a similarity measure, matrix cosine similarity (a generalization of cosine similarity) is employed. State-of-the-art performance is demonstrated on commonly used human eye fixation data (static scenes (N. Bruce & J. Tsotsos, 2006) and dynamic scenes (L. Itti & P. Baldi, 2006)) and some psychological patterns.
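
A small sketch of the two ingredients named above: matrix cosine similarity and the resulting self-resemblance score. The descriptor extraction is not shown, and the neighborhood size and the parameter sigma are assumptions rather than the paper's exact choices.

```python
import numpy as np

def matrix_cosine_similarity(A, B):
    """Frobenius inner product of two feature matrices, normalized by their
    Frobenius norms: a matrix generalization of cosine similarity."""
    return np.sum(A * B) / (np.linalg.norm(A) * np.linalg.norm(B) + 1e-12)

def self_resemblance(center, neighbors, sigma=0.07):
    """Saliency of a feature matrix given its surrounding feature matrices:
    the less the center resembles its surroundings, the higher the score."""
    rho = np.array([matrix_cosine_similarity(center, N) for N in neighbors])
    return 1.0 / np.sum(np.exp((rho - 1.0) / sigma ** 2))

# Example with made-up descriptor matrices of equal shape:
# s = self_resemblance(np.random.rand(9, 4), [np.random.rand(9, 4) for _ in range(8)])
```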


Subject(s)
Attention, Space Perception, Time Perception, Algorithms, Computer Simulation, Ocular Fixation, Humans, Psychological Models, Motion (Physics), Motion Perception, Photic Stimulation/methods, Sensitivity and Specificity, Time, Ocular Vision
10.
Article in English | MEDLINE | ID: mdl-30640613

ABSTRACT

In this work, we broadly connect kernel-based filtering (e.g., the bilateral filter and nonlocal means, but also many more) with general variational formulations of Bayesian regularized least squares, and the related concept of proximal operators. Variational/Bayesian/proximal formulations often result in optimization problems that do not have closed-form solutions, and therefore typically require global iterative solutions. Our main contribution here is to establish how one can approximate the solution of the resulting global optimization problems using locally adaptive filters with specific kernels. Our results are valid for small regularization strength (i.e., weak noise), but the approach is powerful enough to be useful for a wide range of applications because we expose how to derive a "kernelized" solution to these problems that approximates the global solution in one shot, using only local operations. As another side benefit in the reverse direction, given a local data-adaptive filter constructed with a particular choice of kernel, we enable the interpretation of such filters in the variational/Bayesian/proximal framework.
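
Schematically (our notation), the connection is between a global proximal/variational problem and a one-shot, locally adaptive filter:

\[
\hat{x} = \operatorname{prox}_{\lambda f}(y)
= \arg\min_{x} \tfrac{1}{2}\|x - y\|_2^2 + \lambda f(x)
\;\approx\; \mathbf{W}(y)\, y \quad \text{for small } \lambda,
\]

where \mathbf{W}(y) is a row-normalized, data-dependent filter matrix (bilateral or nonlocal-means weights, for instance) whose kernel is matched to the regularizer f. The precise kernel-to-regularizer correspondence is the paper's contribution and is not reproduced here.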

11.
Article in English | MEDLINE | ID: mdl-29994025

ABSTRACT

Automatically learned quality assessment for images has recently become a hot topic due to its usefulness in a wide variety of applications such as evaluating image capture pipelines, storage techniques, and sharing media. Despite the subjective nature of this problem, most existing methods only predict the mean opinion score provided by datasets such as AVA [1] and TID2013 [2]. Our approach differs from others in that we predict the distribution of human opinion scores using a convolutional neural network. Our architecture also has the advantage of being significantly simpler than other methods with comparable performance. Our proposed approach relies on the success (and retraining) of proven, state-of-the-art deep object recognition networks. Our resulting network can be used to not only score images reliably and with high correlation to human perception, but also to assist with adaptation and optimization of photo editing/enhancement algorithms in a photographic pipeline. All this is done without the need for a "golden" reference image, consequently allowing for single-image, semantic- and perceptually-aware, no-reference quality assessment.
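
A minimal sketch of what predicting a full score distribution (rather than a single mean) provides downstream. The ten buckets mirror AVA's 1-10 ratings; the network itself is not shown, and the example vector is made up.

```python
import numpy as np

def score_stats(p):
    """p: predicted probabilities over score buckets 1..10 (summing to 1).
    Returns the mean opinion score and its standard deviation; the latter is
    only available because a distribution, not just a mean, is predicted."""
    s = np.arange(1, len(p) + 1)
    mean = float(np.sum(s * p))
    std = float(np.sqrt(np.sum(((s - mean) ** 2) * p)))
    return mean, std

# Example: a distribution peaked around 6-7 with moderate spread.
p = np.array([0.00, 0.01, 0.03, 0.07, 0.14, 0.22, 0.25, 0.17, 0.08, 0.03])
print(score_stats(p / p.sum()))
```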

12.
IEEE Trans Med Imaging ; 37(9): 1978-1988, 2018 09.
Article in English | MEDLINE | ID: mdl-29990154

ABSTRACT

Optical coherence tomography (OCT) has revolutionized diagnosis and prognosis of ophthalmic diseases by visualization and measurement of retinal layers. To speed up the quantitative analysis of disease biomarkers, an increasing number of automatic segmentation algorithms have been proposed to estimate the boundary locations of retinal layers. While the performance of these algorithms has significantly improved in recent years, a critical question to ask is how far we are from a theoretical limit to OCT segmentation performance. In this paper, we present the Cramér-Rao lower bounds (CRLBs) for the problem of OCT layer segmentation. In deriving the CRLBs, we address the important problem of defining statistical models that best represent the intensity distribution in each layer of the retina. Additionally, we calculate the bounds under an optimal affine bias, reflecting the use of prior knowledge in many segmentation algorithms. Experiments using in vivo images of human retina from a commercial spectral domain OCT system are presented, showing potential for improvement of automated segmentation accuracy. Our general mathematical model can be easily adapted for virtually any OCT system. Furthermore, the statistical models of signal and noise developed in this paper can be utilized for the future improvements of OCT image denoising, reconstruction, and many other applications.
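
For reference, the standard bound and its biased-estimator variant (generic textbook forms in our notation; the paper's contribution lies in the OCT-specific statistical models and in optimizing the affine bias, neither of which is reproduced here):

\[
\operatorname{Cov}(\hat{\theta}) \succeq \mathbf{F}^{-1}(\theta),
\qquad
F_{ij}(\theta) = \mathbb{E}\!\left[
\frac{\partial \ln p(\mathbf{y};\theta)}{\partial \theta_i}
\frac{\partial \ln p(\mathbf{y};\theta)}{\partial \theta_j}
\right],
\]

and, for an estimator with bias b(\theta),

\[
\operatorname{Cov}(\hat{\theta}) \succeq
\Big(\mathbf{I} + \tfrac{\partial b}{\partial \theta}\Big)
\mathbf{F}^{-1}(\theta)
\Big(\mathbf{I} + \tfrac{\partial b}{\partial \theta}\Big)^{\!\top}.
\]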


Subject(s)
Computer-Assisted Image Processing/methods, Statistical Models, Retina/diagnostic imaging, Optical Coherence Tomography/methods, Algorithms, Humans
13.
IEEE Trans Image Process ; 16(3): 774-88, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17357736

ABSTRACT

In all imaging systems, the forward process introduces undesirable effects that cause the output signal to be a distorted version of the input. A typical example is of course the blur introduced by the aperture. When the input to such systems can be controlled, prewarping techniques can be employed which consist of systematically modifying the input such that it (at least approximately) cancels out (or compensates for) the process losses. In this paper, we focus on the optical proximity correction mask design problem for "optical microlithography," a process similar to photographic printing used for transferring binary circuit patterns onto silicon wafers. We consider the idealized case of an incoherent imaging system and solve an inverse problem which is an approximation of the real-world optical lithography problem. Our algorithm is based on pixel-based mask representation and uses a continuous function formulation. We also employ the regularization framework to control the tone and complexity of the synthesized masks. Finally, we discuss the extension of our framework to coherent and (the more practical) partially coherent imaging systems.
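
A schematic of the pixel-based inverse formulation for the idealized incoherent system discussed above (our notation; the smooth threshold and the particular regularizer are generic stand-ins, not necessarily the paper's exact choices):

\[
\hat{m} = \arg\min_{m \in [0,1]^N}
\big\| \sigma\!\big( h \ast m \big) - z^{*} \big\|_2^2 + \gamma\, R(m),
\]

where m is the continuously relaxed pixel mask, h the intensity point-spread function of the incoherent system, \sigma(\cdot) a smooth threshold approximating the resist response, z^{*} the desired binary circuit pattern, and R(m) a regularization term controlling the tone and complexity of the synthesized mask.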


Subject(s)
Algorithms, Image Enhancement/methods, Computer-Assisted Image Interpretation/methods, Information Storage and Retrieval/methods, Photography/methods, Optics and Photonics
14.
IEEE Trans Image Process ; 16(2): 349-66, 2007 Feb.
Article in English | MEDLINE | ID: mdl-17269630

ABSTRACT

In this paper, we make contact with the field of nonparametric statistics and present a development and generalization of tools and results for use in image processing and reconstruction. In particular, we adapt and expand kernel regression ideas for use in image denoising, upscaling, interpolation, fusion, and more. Furthermore, we establish key relationships with some popular existing methods and show how several of these algorithms, including the recently popularized bilateral filter, are special cases of the proposed framework. The resulting algorithms and analyses are amply illustrated with practical examples.
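
As one concrete special case of the framework, here is a compact (unoptimized) bilateral filter in plain NumPy, which amounts to zeroth-order data-adapted kernel regression with a separable spatial/radiometric Gaussian kernel. The window radius and the two bandwidths are assumptions, and a float image in [0, 1] is expected.

```python
import numpy as np

def bilateral_filter(img, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Each output pixel is a normalized, kernel-weighted average of its
    neighbors, with weights that decay with both spatial distance (sigma_s)
    and intensity difference (sigma_r)."""
    pad = np.pad(img, radius, mode="reflect")
    H, W = img.shape
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(x ** 2 + y ** 2) / (2 * sigma_s ** 2))
    out = np.zeros_like(img)
    for i in range(H):
        for j in range(W):
            win = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            radiometric = np.exp(-((win - img[i, j]) ** 2) / (2 * sigma_r ** 2))
            w = spatial * radiometric
            out[i, j] = np.sum(w * win) / np.sum(w)
    return out
```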


Subject(s)
Algorithms, Artificial Intelligence, Image Enhancement/methods, Computer-Assisted Image Interpretation/methods, Information Storage and Retrieval/methods, Computer-Assisted Signal Processing, Regression Analysis
15.
IEEE Trans Image Process ; 26(5): 2338-2351, 2017 May.
Article in English | MEDLINE | ID: mdl-28287968

ABSTRACT

Style transfer is a process of migrating a style from a given image to the content of another, synthesizing a new image, which is an artistic mixture of the two. Recent work on this problem adopting convolutional neural networks (CNN) ignited a renewed interest in this field, due to the very impressive results obtained. There exists an alternative path toward handling the style transfer task, via the generalization of texture synthesis algorithms. This approach has been proposed over the years, but its results are typically less impressive compared with the CNN ones. In this paper, we propose a novel style transfer algorithm that extends the texture synthesis work of Kwatra et al. (2005), while aiming to get stylized images that are closer in quality to the CNN ones. We modify Kwatra's algorithm in several key ways in order to achieve the desired transfer, with emphasis on a consistent way for keeping the content intact in selected regions, while producing hallucinated and rich style in others. The results obtained are visually pleasing and diverse, shown to be competitive with the recent CNN style transfer algorithms. The proposed algorithm is fast and flexible, being able to process any pair of content + style images.

16.
IEEE Trans Image Process ; 26(9): 4229-4242, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28541202

ABSTRACT

Pedestrian detection in thermal infrared images poses unique challenges because of the low resolution and noisy nature of the image. Here, we propose a mid-level attribute in the form of a multidimensional template, or tensor, using local steering kernels (LSK) as low-level descriptors for detecting pedestrians in far infrared images. LSK is specifically designed to deal with intrinsic image noise and pixel level uncertainty by capturing local image geometry succinctly instead of collecting local orientation statistics (e.g., histograms in histogram of oriented gradients). In order to learn the LSK tensor, we introduce a new image similarity kernel following the popular maximum margin framework of support vector machines, facilitating a relatively short and simple training phase for building a rigid pedestrian detector. Tensor representation has several advantages, and indeed, LSK templates allow exact acceleration of the sluggish but de facto sliding window-based detection methodology with the multichannel discrete Fourier transform, facilitating very fast and efficient pedestrian localization. The experimental studies on publicly available thermal infrared images justify our proposals and model assumptions. In addition, the proposed work also involves the release of our in-house annotations of pedestrians in more than 17 000 frames of the OSU color thermal database for the purpose of sharing with the research community.
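
The kind of FFT acceleration alluded to above can be illustrated with ordinary normalized cross-correlation against a single 2-D template: FFT convolutions replace the explicit loop over every window position. The multichannel LSK-tensor matching and matrix cosine similarity of the paper are not reproduced; this is a generic sketch using SciPy.

```python
import numpy as np
from scipy.signal import fftconvolve

def ncc_map(image, template):
    """Sliding-window normalized cross-correlation of `template` over `image`,
    computed entirely with FFT-based convolutions."""
    t = template - template.mean()                     # zero-mean template
    t_norm = np.sqrt(np.sum(t ** 2))
    # Cross-correlation = convolution with the flipped template.
    num = fftconvolve(image, t[::-1, ::-1], mode="valid")
    # Per-window sums give the local image energy without an explicit loop.
    ones = np.ones_like(template)
    win_sum = fftconvolve(image, ones, mode="valid")
    win_sq = fftconvolve(image ** 2, ones, mode="valid")
    win_energy = np.maximum(win_sq - win_sum ** 2 / template.size, 1e-12)
    return num / (np.sqrt(win_energy) * t_norm + 1e-12)

# Example: peaks of ncc_map(scene, patch) mark likely locations of `patch`.
```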

17.
IEEE Trans Image Process ; 15(6): 1413-28, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16764267

ABSTRACT

Recently, there has been a great deal of work developing super-resolution algorithms for combining a set of low-quality images to produce a set of higher quality images. Either explicitly or implicitly, such algorithms must perform the joint task of registering and fusing the low-quality image data. While many such algorithms have been proposed, very little work has addressed the performance bounds for such problems. In this paper, we analyze the performance limits from statistical first principles using Cramér-Rao inequalities. Such analysis offers insight into the fundamental super-resolution performance bottlenecks as they relate to the subproblems of image registration, reconstruction, and image restoration.


Subject(s)
Algorithms, Image Enhancement/methods, Computer-Assisted Image Interpretation/methods, Three-Dimensional Imaging/methods, Information Storage and Retrieval/methods, Statistical Models, Subtraction Technique, Computer Simulation, Computer-Assisted Numerical Analysis, Reproducibility of Results, Sensitivity and Specificity, Computer-Assisted Signal Processing
18.
IEEE Trans Image Process ; 15(1): 141-59, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16435545

ABSTRACT

In the last two decades, two related categories of problems have been studied independently in the image restoration literature: super-resolution and demosaicing. A closer look at these problems reveals the relation between them, and, as conventional color digital cameras suffer from both low spatial resolution and color filtering, it is reasonable to address them in a unified context. In this paper, we propose a fast and robust hybrid method of super-resolution and demosaicing, based on a maximum a posteriori (MAP) estimation technique by minimizing a multiterm cost function. The L1 norm is used for measuring the difference between the projected estimate of the high-resolution image and each low-resolution image, removing outliers in the data and errors due to possibly inaccurate motion estimation. Bilateral regularization is used for spatially regularizing the luminance component, resulting in sharp edges and forcing interpolation along the edges and not across them. Simultaneously, Tikhonov regularization is used to smooth the chrominance components. Finally, an additional regularization term is used to force similar edge location and orientation in different color channels. We show that the minimization of the total cost function is relatively easy and fast. Experimental results on synthetic and real data sets confirm the effectiveness of our method.
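
Schematically, a cost function of the kind described above can be written as (the symbols and weights here are ours, grouped to match the description rather than copied from the paper):

\[
\hat{X} = \arg\min_{X} \sum_{k} \big\| D_k H_k F_k X - Y_k \big\|_1
+ \lambda_1 R_{\mathrm{BTV}}(X_L)
+ \lambda_2 \| \Lambda X_C \|_2^2
+ \lambda_3 R_{\mathrm{edge}}(X),
\]

where Y_k are the low-resolution, color-filtered frames; F_k, H_k, and D_k model motion, blur, and downsampling; X_L and X_C are the luminance and chrominance components of the high-resolution estimate X; R_{\mathrm{BTV}} is the bilateral regularizer; \Lambda is a high-pass operator for the Tikhonov term; and R_{\mathrm{edge}} encourages consistent edge locations and orientations across color channels.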


Subject(s)
Algorithms, Color, Colorimetry, Image Enhancement/methods, Computer-Assisted Image Interpretation/methods, Information Storage and Retrieval/methods, Subtraction Technique, Computer-Assisted Signal Processing
19.
IEEE Trans Pattern Anal Mach Intell ; 38(3): 546-62, 2016 Mar.
Article in English | MEDLINE | ID: mdl-27046497

ABSTRACT

One-shot, generic object detection involves searching for a single query object in a larger target image. Relevant approaches have benefited from features that typically model the local similarity patterns. In this paper, we combine local similarity (encoded by local descriptors) with a global context (i.e., a graph structure) of pairwise affinities among the local descriptors, embedding the query descriptors into a low-dimensional but discriminatory subspace. Unlike principal components that preserve the global structure of the feature space, we actually seek a linear approximation to the Laplacian eigenmap that permits a locality-preserving embedding of high-dimensional region descriptors. Our second contribution is an accelerated but exact computation of matrix cosine similarity as the decision rule for detection, obviating the computationally expensive sliding window search. We leverage the power of the Fourier transform combined with the integral image to achieve superior runtime efficiency that allows us to test multiple hypotheses (for pose estimation) within a reasonably short time. Our approach to one-shot detection is training-free, and experiments on the standard data sets confirm the efficacy of our model. Moreover, the low computation cost of the proposed (codebook-free) object detector facilitates rather straightforward query detection in large data sets, including movie videos.

20.
Sci Rep ; 5: 12303, 2015 Jul 23.
Article in English | MEDLINE | ID: mdl-26201867

ABSTRACT

The increasing interest in nanoscience in many research fields like physics, chemistry, and biology, including the environmental fate of the produced nano-objects, requires instrumental improvements to address the sub-micrometric analysis challenges. The originality of our approach is to use both the super-resolution concept and the multivariate curve resolution (MCR-ALS) algorithm in confocal Raman imaging to surmount its instrumental limits and to characterize chemical components of atmospheric aerosols at the level of the individual particles. We demonstrate the possibility of going beyond the diffraction limit with this algorithmic approach. Indeed, the spatial resolution is improved by 65% to achieve 200 nm for the considered far-field spectrophotometer. A multivariate curve resolution method is then coupled with super-resolution in order to explore the heterogeneous structure of submicron particles for describing physical and chemical processes that may occur in the atmosphere. The proposed methodology provides new tools for sub-micron characterization of heterogeneous samples using a far-field (i.e., conventional) Raman imaging spectrometer.
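
For readers unfamiliar with MCR-ALS, a bare-bones sketch of its alternating least-squares core is given below. Real implementations add normalization, closure, and other constraints, plus a sensible initialization; the random start, the clipping-based non-negativity, and the iteration count here are assumptions.

```python
import numpy as np

def mcr_als(D, n_components, n_iter=200, seed=0):
    """Factorize a spectral data matrix D (pixels x wavenumbers) as D ~ C @ S,
    alternating non-negative least-squares updates of the concentration maps C
    and the pure-component spectra S."""
    rng = np.random.default_rng(seed)
    C = np.abs(rng.standard_normal((D.shape[0], n_components)))
    for _ in range(n_iter):
        S = np.maximum(np.linalg.lstsq(C, D, rcond=None)[0], 0.0)        # spectra
        C = np.maximum(np.linalg.lstsq(S.T, D.T, rcond=None)[0].T, 0.0)  # maps
    return C, S

# Example: C, S = mcr_als(np.abs(np.random.rand(500, 300)), n_components=3)
```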
