Results 1 - 19 of 19
1.
Biomed Opt Express ; 13(10): 5447-5467, 2022 Oct 01.
Article in English | MEDLINE | ID: mdl-36425622

ABSTRACT

Camera-based heart rate measurement is becoming an attractive option as a non-contact modality for continuous remote health and engagement monitoring. However, reliable heart rate extraction from camera-based measurement is challenging in realistic scenarios, especially when the subject is moving. In this work, we develop a motion-robust algorithm, called RobustPPG, for extracting photoplethysmography (PPG) signals from face video and estimating the heart rate. Our key innovation is to explicitly model and generate motion distortions due to the movements of the person's face. We use inverse rendering to obtain the 3D shape and albedo of the face and environment lighting from video frames and then render the human face for each frame. The rendered face is similar to the original face but does not contain the heart rate signal; facial movements alone cause pixel intensity variation in the generated video frames. Finally, we use the generated motion distortion to filter the motion-induced measurements. We demonstrate that our approach performs better than the state-of-the-art methods in extracting a clean blood volume signal, with over 2 dB signal quality improvement and a 30% improvement in the RMSE of the estimated heart rate in intense motion scenarios.
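A minimal sketch of the filtering idea described above: the rendered, pulse-free face trace is used to cancel the motion component of the measured skin-pixel trace, and the heart rate is read off the spectrum. The function names and the least-squares cancellation step are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def remove_motion_component(measured, rendered):
    """Project out the motion-only trace (from the rendered face)
    from the measured skin-pixel trace, leaving the pulse signal.
    Both inputs are 1-D arrays sampled at the video frame rate."""
    m = rendered - rendered.mean()
    s = measured - measured.mean()
    alpha = np.dot(s, m) / np.dot(m, m)  # least-squares fit of motion to measurement
    return s - alpha * m

def estimate_heart_rate(ppg, fps, lo=0.7, hi=4.0):
    """Return the dominant frequency (in BPM) inside the plausible
    heart-rate band (0.7-4 Hz, i.e. 42-240 BPM)."""
    spectrum = np.abs(np.fft.rfft(ppg * np.hanning(len(ppg))))
    freqs = np.fft.rfftfreq(len(ppg), d=1.0 / fps)
    band = (freqs >= lo) & (freqs <= hi)
    return 60.0 * freqs[band][np.argmax(spectrum[band])]
```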

2.
IEEE Trans Pattern Anal Mach Intell ; 36(2): 248-60, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24356347

ABSTRACT

Cameras face a fundamental trade-off between spatial and temporal resolution. Digital still cameras can capture images with high spatial resolution, but most high-speed video cameras have relatively low spatial resolution. It is hard to overcome this trade-off without incurring a significant increase in hardware costs. In this paper, we propose techniques for sampling, representing, and reconstructing the space-time volume to overcome this trade-off. Our approach has two important distinctions compared to previous works: 1) We achieve sparse representation of videos by learning an overcomplete dictionary on video patches, and 2) we adhere to practical hardware constraints on sampling schemes imposed by architectures of current image sensors, which means that our sampling function can be implemented on CMOS image sensors with modified control units in the future. We evaluate the two components of our approach, the sampling function and the sparse representation, by comparing them to several existing approaches. We also implement a prototype imaging system with pixel-wise coded exposure control using a liquid crystal on silicon device. System characteristics such as field of view and modulation transfer function are evaluated for our imaging system. Both simulations and experiments on a wide range of scenes show that our method can effectively reconstruct a video from a single coded image while maintaining high spatial resolution.


Subject(s)
Algorithms, Data Compression/methods, Image Enhancement/methods, Photography/methods, Signal Processing, Computer-Assisted, Video Recording/methods, Image Enhancement/instrumentation, Photography/instrumentation, Reproducibility of Results, Sample Size, Sensitivity and Specificity, Spatio-Temporal Analysis, Video Recording/instrumentation
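A hedged sketch of the reconstruction step: with a per-pixel coded exposure matrix Phi and a learned overcomplete dictionary D, each space-time patch is recovered by sparse coding. The variable names and the use of orthogonal matching pursuit are assumptions; the paper's solver and patch handling may differ.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def reconstruct_patch(y, Phi, D, sparsity=8):
    """Recover a vectorized space-time patch x (p values covering T
    frames) from one coded-exposure measurement y = Phi @ x, assuming
    x = D @ a with a sparse in the learned dictionary D (p x K)."""
    A = Phi @ D                                   # effective sensing matrix
    a = orthogonal_mp(A, y, n_nonzero_coefs=sparsity)
    return D @ a                                  # reconstructed patch
```

Here each row of Phi records which (pixel, frame) samples the coded exposure integrated into the single captured image.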
3.
IEEE Trans Pattern Anal Mach Intell ; 35(3): 555-67, 2013 Mar.
Article in English | MEDLINE | ID: mdl-22665721

ABSTRACT

We propose a new method named compressive structured light for recovering inhomogeneous participating media. Whereas conventional structured light methods emit coded light patterns onto the surface of an opaque object to establish correspondence for triangulation, compressive structured light projects patterns into a volume of participating medium to produce images which are integral measurements of the volume density along the line of sight. For a typical participating medium encountered in the real world, the integral nature of the acquired images enables the use of compressive sensing techniques that can recover the entire volume density from only a few measurements. This makes the acquisition process more efficient and enables reconstruction of dynamic volumetric phenomena. Moreover, our method requires the projection of multiplexed coded illumination, which has the added advantage of increasing the signal-to-noise ratio of the acquisition. Finally, we propose an iterative algorithm to correct for the attenuation of the participating medium during the reconstruction process. We show the effectiveness of our method with simulations as well as experiments on the volumetric recovery of multiple translucent layers, 3D point clouds etched in glass, and the dynamic process of milk drops dissolving in water.
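A per-ray sketch of the compressive recovery, assuming a plain L1 (sparsity) prior; the paper's formulation also considers sparsity of the density gradient and an attenuation correction, both omitted here.

```python
import numpy as np
from sklearn.linear_model import Lasso

def recover_density(measurements, patterns, alpha=1e-3):
    """For one camera ray: 'measurements' holds the pixel value under
    each of M projected coded patterns, and 'patterns' is the M x N
    matrix of pattern intensities at the N voxels along that ray.
    Each pixel is an integral of density weighted by the pattern, so
    recovery is a sparse linear inverse problem."""
    model = Lasso(alpha=alpha, positive=True, fit_intercept=False)
    model.fit(patterns, measurements)
    return model.coef_          # N voxel densities along the ray
```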

4.
IEEE Trans Image Process ; 22(2): 447-58, 2013 Feb.
Article in English | MEDLINE | ID: mdl-22955907

ABSTRACT

A number of computational imaging techniques have been introduced to improve image quality by increasing light throughput. These techniques use optical coding to measure a stronger signal level. However, the performance of these techniques is limited by the decoding step, which amplifies noise. Although it is well understood that optical coding can increase performance at low light levels, little is known about the quantitative performance advantage of computational imaging in general settings. In this paper, we derive the performance bounds for various computational imaging techniques. We then discuss the implications of these bounds for several real-world scenarios (e.g., illumination conditions, scene properties, and sensor noise characteristics). Our results show that computational imaging techniques do not provide a significant performance advantage when imaging with illumination that is brighter than typical daylight. These results can be readily used by practitioners to design the most suitable imaging systems given the application at hand.
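A small numeric illustration of why the advantage disappears at high light levels, using Hadamard-style multiplexing as the example coding scheme; the variance expressions are the standard ones for an S-matrix, while the paper's bounds are more general.

```python
import numpy as np

def demux_variance(n, photons_per_source, read_var):
    """Variance of one demultiplexed source estimate with an n x n
    S-matrix (each shot sums (n+1)/2 sources), vs. a single direct
    capture. Shot-noise variance equals the photon count; read noise
    is added once per shot."""
    per_shot_var = (n + 1) / 2 * photons_per_source + read_var
    mux_var = 4 * n / (n + 1) ** 2 * per_shot_var
    single_var = photons_per_source + read_var
    return mux_var, single_var

# Dim, read-noise-limited scene: multiplexed coding wins.
print(demux_variance(n=7, photons_per_source=10, read_var=400))
# Bright, shot-noise-limited scene: the coding advantage is gone.
print(demux_variance(n=7, photons_per_source=1e5, read_var=400))
```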

5.
J Opt Soc Am A Opt Image Sci Vis ; 28(12): 2540-53, 2011 Dec 01.
Article in English | MEDLINE | ID: mdl-22193267

ABSTRACT

The resolution of a camera system determines the fidelity of visual features in captured images. Higher resolution implies greater fidelity and, thus, greater accuracy when performing automated vision tasks, such as object detection, recognition, and tracking. However, the resolution of any camera is fundamentally limited by geometric aberrations. In the past, it has generally been accepted that the resolution of lenses with geometric aberrations cannot be increased beyond a certain threshold. We derive an analytic scaling law showing that, for lenses with spherical aberrations, resolution can be increased beyond the aberration limit by applying a postcapture deblurring step. We then show that resolution can be further increased when image priors are introduced. Based on our analysis, we advocate for computational camera designs consisting of a spherical lens shared by several small planar sensors. We show example images captured with a proof-of-concept gigapixel camera, demonstrating that high resolution can be achieved with a compact form factor and low complexity. We conclude with an analysis on the trade-off between performance and complexity for computational imaging systems with spherical lenses.
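The post-capture deblurring step can be sketched as a frequency-domain Wiener deconvolution; the paper's analysis additionally brings in image priors, which are omitted in this minimal version.

```python
import numpy as np

def wiener_deblur(image, psf, snr=100.0):
    """Frequency-domain Wiener deconvolution. 'psf' must have the
    same shape as 'image' with its origin at pixel [0, 0]."""
    H = np.fft.fft2(psf)
    G = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)  # Wiener filter
    return np.real(np.fft.ifft2(np.fft.fft2(image) * G))
```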

6.
IEEE Trans Image Process ; 20(12): 3322-40, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22020681

ABSTRACT

A computational camera uses a combination of optics and processing to produce images that cannot be captured with traditional cameras. In the last decade, computational imaging has emerged as a vibrant field of research. A wide variety of computational cameras have been demonstrated that encode more useful visual information in the captured images than conventional cameras do. In this paper, we survey computational cameras from two perspectives. First, we present a taxonomy of computational camera designs according to the coding approaches, including object side coding, pupil plane coding, sensor side coding, illumination coding, camera arrays and clusters, and unconventional imaging systems. Second, we use the abstract notion of light field representation as a general tool to describe computational camera designs, where each camera can be formulated as a projection of a high-dimensional light field to a 2-D image sensor. We show how individual optical devices transform light fields and use these transforms to illustrate how different computational camera designs (collections of optical devices) capture and encode useful visual information.
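The light-field view of a camera can be made concrete with a toy projection: a sensor image is an integral of the 4-D light field over the aperture, and pupil-plane coding is a weighting of that integral. This is an illustrative abstraction, not code from the survey.

```python
import numpy as np

def render_image(light_field, aperture_mask=None):
    """Treat a camera as a projection of a 4-D light field
    L[u, v, y, x] (u, v index rays through the aperture plane) onto
    a 2-D sensor: integrate over the aperture. A non-uniform
    aperture_mask over (u, v) models pupil-plane coding."""
    U, V = light_field.shape[:2]
    if aperture_mask is None:
        aperture_mask = np.ones((U, V))
    image = np.tensordot(aperture_mask, light_field, axes=([0, 1], [0, 1]))
    return image / aperture_mask.sum()
```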

7.
IEEE Trans Pattern Anal Mach Intell ; 33(10): 1962-77, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21383395

ABSTRACT

We introduce the use of describable visual attributes for face verification and image search. Describable visual attributes are labels that can be given to an image to describe its appearance. This paper focuses on images of faces and the attributes used to describe them, although the concepts also apply to other domains. Examples of face attributes include gender, age, jaw shape, nose size, etc. The advantages of an attribute-based representation for vision tasks are manifold: attributes can be composed to create descriptions at various levels of specificity; they are generalizable, as they can be learned once and then applied to recognize new objects or categories without any further training; and they are efficient, possibly requiring exponentially fewer attributes (and training data) than explicitly naming each category. We show how one can create and label large data sets of real-world images to train classifiers which measure the presence, absence, or degree to which an attribute is expressed in images. These classifiers can then automatically label new images. We demonstrate the current effectiveness, and explore the future potential, of using attributes for face verification and image search via human and computational experiments. Finally, we introduce two new face data sets, named FaceTracer and PubFig, with labeled attributes and identities, respectively.


Subject(s)
Biometric Identification/methods, Face/anatomy & histology, Image Processing, Computer-Assisted/methods, Algorithms, Female, Humans, Male
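A hedged sketch of the attribute pipeline: one binary classifier per attribute, with the vector of classifier outputs serving as the face description for verification or search. Linear SVMs and the helper names are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_attribute_classifiers(features, attribute_labels):
    """One binary classifier per describable attribute ('male',
    'pointy nose', ...). 'features' is an (n_faces, d) array of
    low-level face features; attribute_labels maps each attribute
    name to a 0/1 array of length n_faces."""
    return {name: LinearSVC().fit(features, y)
            for name, y in attribute_labels.items()}

def attribute_signature(classifiers, feat):
    """Signed distances to each attribute boundary form a compact
    description of a face; verification compares two signatures."""
    return np.array([clf.decision_function(feat.reshape(1, -1))[0]
                     for clf in classifiers.values()])
```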
8.
IEEE Trans Pattern Anal Mach Intell ; 33(1): 58-71, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21088319

ABSTRACT

The range of scene depths that appear focused in an image is known as the depth of field (DOF). Conventional cameras are limited by a fundamental trade-off between depth of field and signal-to-noise ratio (SNR). For a dark scene, the aperture of the lens must be opened up to maintain SNR, which causes the DOF to shrink. Also, today's cameras have DOFs that correspond to a single slab that is perpendicular to the optical axis. In this paper, we present an imaging system that enables one to control the DOF in new and powerful ways. Our approach is to vary the position and/or orientation of the image detector during the integration time of a single photograph. Even when the detector motion is very small (tens of microns), a large range of scene depths (several meters) is captured, both in and out of focus. Our prototype camera uses a micro-actuator to translate the detector along the optical axis during image integration. Using this device, we demonstrate four applications of flexible DOF. First, we describe extended DOF where a large depth range is captured with a very wide aperture (low noise) but with nearly depth-independent defocus blur. Deconvolving a captured image with a single blur kernel gives an image with extended DOF and high SNR. Next, we show the capture of images with discontinuous DOFs. For instance, near and far objects can be imaged sharply, while objects in between are severely blurred. Third, we show that our camera can capture images with tilted DOFs (Scheimpflug imaging) without tilting the image detector. Finally, we demonstrate how our camera can be used to realize nonplanar DOFs. We believe flexible DOF imaging can open a new creative dimension in photography and lead to new capabilities in scientific imaging, vision, and graphics.


Subject(s)
Image Enhancement/instrumentation, Image Enhancement/methods, Image Processing, Computer-Assisted/methods, Photography/instrumentation, Photography/methods, Algorithms, Lenses
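A toy simulation of the extended-DOF mode: sweeping the detector integrates defocus blur over a range of radii, so points at different depths end up with nearly the same effective PSF, which a single deconvolution can then remove. The disk PSFs and the radius ranges are illustrative assumptions.

```python
import numpy as np

def disk_psf(radius, size=65):
    """Defocus blur modeled as a uniform disk of the given radius."""
    y, x = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
    psf = (x ** 2 + y ** 2 <= max(radius, 0.5) ** 2).astype(float)
    return psf / psf.sum()

def swept_detector_psf(blur_radii, size=65):
    """Integrate defocus disks over the radii a scene point sweeps
    through while the detector translates during one exposure."""
    return sum(disk_psf(r, size) for r in blur_radii) / len(blur_radii)

# Two depths: instantaneous blur differs, but the swept radius ranges
# overlap heavily, so the integrated PSFs come out nearly identical.
psf_near = swept_detector_psf(np.linspace(0.0, 12.0, 25))
psf_far = swept_detector_psf(np.linspace(2.0, 14.0, 25))
```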
9.
IEEE Trans Image Process ; 19(9): 2241-53, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20350852

ABSTRACT

We propose the concept of a generalized assorted pixel (GAP) camera, which enables the user to capture a single image of a scene and, after the fact, control the tradeoff between spatial resolution, dynamic range, and spectral detail. The GAP camera uses a complex array (or mosaic) of color filters. A major problem with using such an array is that the captured image is severely under-sampled for at least some of the filter types. This leads to reconstructed images with strong aliasing. We make four contributions in this paper: 1) We present a comprehensive optimization method to arrive at the spatial and spectral layout of the color filter array of a GAP camera. 2) We develop a novel algorithm for reconstructing the under-sampled channels of the image while minimizing aliasing artifacts. 3) We demonstrate how the user can capture a single image and then control the tradeoff of spatial resolution to generate a variety of images, including monochrome, high dynamic range (HDR) monochrome, RGB, HDR RGB, and multispectral images. 4) We verify the performance of our GAP camera using extensive simulations with multispectral images of real-world scenes. A large database of these multispectral images has been made available at http://www1.cs.columbia.edu/CAVE/projects/gap_camera/ for use by the research community.
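A baseline sketch of reconstructing one under-sampled channel of a GAP mosaic by normalized convolution; the paper's algorithm is a learned anti-aliasing reconstruction, so treat this only as the naive reference point.

```python
import numpy as np
from scipy.ndimage import convolve

def interpolate_channel(raw, mask, kernel_size=5):
    """Estimate a full-resolution channel from the sparse samples
    selected by 'mask' (1 where this filter type sits in the mosaic)
    by averaging the available samples in each local window."""
    k = np.ones((kernel_size, kernel_size))
    num = convolve(raw * mask, k, mode='mirror')
    den = convolve(mask.astype(float), k, mode='mirror')
    return num / np.maximum(den, 1e-8)
```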

10.
IEEE Trans Pattern Anal Mach Intell ; 29(8): 1339-54, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17568139

ABSTRACT

Imaging of objects under variable lighting directions is an important and frequent practice in computer vision, machine vision, and image-based rendering. Methods for such imaging have traditionally used only a single light source per acquired image. They may result in images that are too dark and noisy, e.g., due to the need to avoid saturation of highlights. We introduce an approach that can significantly improve the quality of such images, in which multiple light sources illuminate the object simultaneously from different directions. These illumination-multiplexed frames are then computationally demultiplexed. The approach is useful for imaging dim objects, as well as objects having a specular reflection component. For signal-independent noise, we give the optimal scheme by which lighting should be multiplexed to obtain the highest-quality output. The scheme is based on Hadamard codes. The consequences of imperfections such as stray light, saturation, and noisy illumination sources are then studied. In addition, the paper analyzes the implications of shot noise, which is signal-dependent, for Hadamard multiplexing. The approach facilitates practical lighting setups having high directional resolution. This is shown by a setup we devise, which is flexible, scalable, and programmable. We used it to demonstrate the benefit of multiplexing in experiments.
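A minimal sketch of Hadamard-code multiplexing and demultiplexing with an S-matrix (on/off lighting patterns); the order n+1 is taken as a power of two for convenience, and real setups must also handle the stray light, saturation, and source-noise effects analyzed in the paper.

```python
import numpy as np
from scipy.linalg import hadamard

def s_matrix(n_plus_1):
    """Binary S-matrix (light on/off) from a Hadamard matrix of
    order n+1; each row is one lighting pattern that turns on about
    half of the n sources."""
    H = hadamard(n_plus_1)          # entries +/-1, first row/col all +1
    return (1 - H[1:, 1:]) // 2     # n x n matrix of 0s and 1s

def demultiplex(frames, S):
    """frames: (n, h, w) stack captured under the n multiplexed
    patterns; returns the n single-source images."""
    S_inv = np.linalg.inv(S.astype(float))
    return np.tensordot(S_inv, frames, axes=(1, 0))
```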

11.
IEEE Trans Vis Comput Graph ; 13(3): 595-609, 2007.
Article in English | MEDLINE | ID: mdl-17356224

ABSTRACT

The properties of virtually all real-world materials change with time, causing their bidirectional reflectance distribution functions (BRDFs) to be time varying. However, none of the existing BRDF models and databases take time variation into consideration; they represent the appearance of a material at a single time instance. In this paper, we address the acquisition, analysis, modeling, and rendering of a wide range of time-varying BRDFs (TVBRDFs). We have developed an acquisition system that is capable of sampling a material's BRDF at multiple time instances, with each time sample acquired within 36 sec. We have used this acquisition system to measure the BRDFs of a wide range of time-varying phenomena, which include the drying of various types of paints (watercolor, spray, and oil), the drying of wet rough surfaces (cement, plaster, and fabrics), the accumulation of dusts (household and joint compound) on surfaces, and the melting of materials (chocolate). Analytic BRDF functions are fit to these measurements and the model parameters' variations with time are analyzed. Each category exhibits interesting and sometimes nonintuitive parameter trends. These parameter trends are then used to develop analytic TVBRDF models. The analytic TVBRDF models enable us to apply effects such as paint drying and dust accumulation to arbitrary surfaces and novel materials.
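A hedged sketch of the fit-then-track step: an analytic lobe is fit independently at each time sample, giving a parameter trajectory that can be interpolated for rendering. The Phong-style lobe here is an illustrative stand-in for the paper's BRDF models.

```python
import numpy as np
from scipy.optimize import curve_fit

def phong_lobe(cos_half, kd, ks, n):
    """A simple parametric reflectance lobe (not the paper's exact
    model) as a function of the half-angle cosine."""
    return kd + ks * cos_half ** n

def fit_tvbrdf(cos_half, brdf_samples_per_time):
    """Fit the lobe at each time sample; the sequence of (kd, ks, n)
    then captures, e.g., a paint's specularity decaying as it dries,
    and can be interpolated to render intermediate times."""
    params = []
    for samples in brdf_samples_per_time:
        p, _ = curve_fit(phong_lobe, cos_half, samples,
                         p0=(0.2, 0.5, 20.0), maxfev=5000)
        params.append(p)
    return np.array(params)   # (T, 3) parameter trajectory
```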

13.
IEEE Trans Pattern Anal Mach Intell ; 27(10): 1675-9, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16238002

ABSTRACT

Principal Component Analysis (PCA) is extensively used in computer vision and image processing. Since it provides the optimal linear subspace in a least-squares sense, it has been used for dimensionality reduction and subspace analysis in various domains. However, its scalability is very limited because of its inherent computational complexity. We introduce a new framework for applying PCA to visual data which takes advantage of the spatio-temporal correlation and localized frequency variations that are typically found in such data. Instead of applying PCA to the whole volume of data (complete set of images), we partition the volume into a set of blocks and apply PCA to each block. Then, we group the subspaces corresponding to the blocks and merge them together. As a result, we not only achieve greater efficiency in the resulting representation of the visual data, but also successfully scale PCA to handle large data sets. We present a thorough analysis of the computational complexity and storage benefits of our approach. We apply our algorithm to several types of videos. We show that, in addition to its storage and speed benefits, the algorithm results in a useful representation of the visual data.


Subject(s)
Algorithms, Artificial Intelligence, Image Enhancement/methods, Image Interpretation, Computer-Assisted/methods, Information Storage and Retrieval/methods, Pattern Recognition, Automated/methods, Principal Component Analysis, Computer Simulation, Models, Statistical
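A minimal sketch of the block-wise step, assuming frames of shape (T, H, W); the subsequent grouping and merging of similar block subspaces, which gives the method much of its compression, is omitted.

```python
import numpy as np

def blockwise_pca(frames, block=16, var_keep=0.99):
    """Apply PCA independently to each spatial block of a video,
    exploiting locality instead of running one PCA on full images.
    Returns, per block, the principal directions retaining the
    requested fraction of variance."""
    T, H, W = frames.shape
    bases = {}
    for y in range(0, H, block):
        for x in range(0, W, block):
            X = frames[:, y:y + block, x:x + block].reshape(T, -1)
            X = X - X.mean(axis=0)
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
            k = int(np.searchsorted(ratio, var_keep)) + 1
            bases[(y, x)] = Vt[:k]
    return bases
```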
14.
IEEE Trans Pattern Anal Mach Intell ; 27(6): 977-87, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15943428

ABSTRACT

Video cameras must produce images at a reasonable frame-rate and with a reasonable depth of field. These requirements impose fundamental physical limits on the spatial resolution of the image detector. As a result, current cameras produce videos with a very low resolution. The resolution of videos can be computationally enhanced by moving the camera and applying super-resolution reconstruction algorithms. However, a moving camera introduces motion blur, which limits super-resolution quality. We analyze this effect and derive a theoretical result showing that motion blur has a substantial degrading effect on the performance of super-resolution. The conclusion is that, in order to achieve the highest resolution, motion blur should be avoided. Motion blur can be minimized by sampling the space-time volume of the video in a specific manner. We have developed a novel camera, called the "jitter camera," that achieves this sampling. By applying an adaptive super-resolution algorithm to the video produced by the jitter camera, we show that resolution can be notably enhanced for stationary or slowly moving objects, while it is improved slightly or left unchanged for objects with fast and complex motions. The end result is a video that has a significantly higher resolution than the captured one.


Subject(s)
Algorithms, Artificial Intelligence, Image Enhancement/instrumentation, Image Interpretation, Computer-Assisted/instrumentation, Photography/instrumentation, Signal Processing, Computer-Assisted/instrumentation, Video Recording/instrumentation, Artifacts, Equipment Design, Equipment Failure Analysis, Feasibility Studies, Image Enhancement/methods, Image Interpretation, Computer-Assisted/methods, Motion, Pattern Recognition, Automated/methods, Photography/methods, Pilot Projects, Reproducibility of Results, Sensitivity and Specificity, Video Recording/methods
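A toy shift-and-add reconstruction for jittered sampling, assuming known integer sub-pixel offsets on the high-resolution grid; the paper's adaptive super-resolution algorithm additionally detects and protects fast-moving regions.

```python
import numpy as np

def shift_and_add(frames, shifts, scale=2):
    """Place each low-res frame's samples onto a finer grid at its
    known jitter offset and average. 'shifts' holds integer (dy, dx)
    offsets in high-res pixels, each in range(scale)."""
    n, h, w = frames.shape
    acc = np.zeros((h * scale, w * scale))
    cnt = np.zeros_like(acc)
    for f, (dy, dx) in zip(frames, shifts):
        acc[dy::scale, dx::scale] += f
        cnt[dy::scale, dx::scale] += 1
    return acc / np.maximum(cnt, 1)
```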
15.
IEEE Trans Pattern Anal Mach Intell ; 27(4): 518-30, 2005 Apr.
Article in English | MEDLINE | ID: mdl-15794158

ABSTRACT

Multisampled imaging is a general framework for using pixels on an image detector to simultaneously sample multiple dimensions of imaging (space, time, spectrum, brightness, polarization, etc.). The mosaic of red, green, and blue spectral filters found in most solid-state color cameras is one example of multisampled imaging. We briefly describe how multisampling can be used to explore other dimensions of imaging. Once such an image is captured, smooth reconstructions along the individual dimensions can be obtained using standard interpolation algorithms. Typically, this results in a substantial reduction of resolution (and, hence, image quality). One can extract significantly greater resolution in each dimension by noting that the light fields associated with real scenes have enormous redundancies within them, causing different dimensions to be highly correlated. Hence, multisampled images can be better interpolated using local structural models that are learned offline from a diverse set of training images. The specific type of structural models we use are based on polynomial functions of measured image intensities. They are very effective as well as computationally efficient. We demonstrate the benefits of structural interpolation using three specific applications. These are 1) traditional color imaging with a mosaic of color filters, 2) high dynamic range monochrome imaging using a mosaic of exposure filters, and 3) high dynamic range color imaging using a mosaic of overlapping color and exposure filters.


Subject(s)
Algorithms, Artificial Intelligence, Image Enhancement/methods, Image Interpretation, Computer-Assisted/methods, Imaging, Three-Dimensional/methods, Pattern Recognition, Automated/methods, Signal Processing, Computer-Assisted, Computer Graphics, Information Storage and Retrieval/methods, Numerical Analysis, Computer-Assisted, Reproducibility of Results, Sensitivity and Specificity, Subtraction Technique
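A sketch of learning one structural model offline, assuming training pairs of measured mosaic neighborhoods and their true target values; the degree-2 polynomial features mirror the "polynomial functions of measured image intensities" mentioned above, though the exact neighborhood layout is an assumption.

```python
import numpy as np

def fit_structural_model(neighborhoods, targets):
    """Learn a degree-2 polynomial mapping from a measured mosaic
    neighborhood (one row per training site) to the value to be
    reconstructed at that site. At capture time the same features are
    built from each local window and dotted with the coefficients."""
    n, d = neighborhoods.shape
    feats = [np.ones(n)] + [neighborhoods[:, i] for i in range(d)]
    feats += [neighborhoods[:, i] * neighborhoods[:, j]
              for i in range(d) for j in range(i, d)]
    A = np.stack(feats, axis=1)
    coef, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return coef
```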
16.
IEEE Trans Pattern Anal Mach Intell ; 26(10): 1272-82, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15641715

ABSTRACT

Many vision applications require precise measurement of scene radiance. The function relating scene radiance to image intensity of an imaging system is called the camera response. We analyze the properties that all camera responses share. This allows us to find the constraints that any response function must satisfy. These constraints determine the theoretical space of all possible camera responses. We have collected a diverse database of real-world camera response functions (DoRF). Using this database, we show that real-world responses occupy a small part of the theoretical space of all possible responses. We combine the constraints from our theoretical space with the data from DoRF to create a low-parameter empirical model of response (EMoR). This response model allows us to accurately interpolate the complete response function of a camera from a small number of measurements obtained using a standard chart. We also show that the model can be used to accurately estimate the camera response from images of an arbitrary scene taken using different exposures. The DoRF database and the EMoR model can be downloaded at http://www.cs.columbia.edu/CAVE.


Subject(s)
Algorithms, Artificial Intelligence, Image Enhancement/methods, Image Interpretation, Computer-Assisted/methods, Models, Theoretical, Photography/instrumentation, Photography/methods, Cluster Analysis, Computer Graphics, Computer Simulation, Equipment Failure Analysis/methods, Imaging, Three-Dimensional/methods, Numerical Analysis, Computer-Assisted, Pattern Recognition, Automated/methods, Sensitivity and Specificity, Signal Processing, Computer-Assisted
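A hedged sketch of fitting the low-parameter model from a few chart measurements, assuming the mean response f0 and the basis curves h_k are available as sampled arrays (as in the downloadable EMoR data):

```python
import numpy as np

def fit_emor(irradiance, intensity, f0_curve, basis, n_params=3):
    """Fit response f(E) ~ f0(E) + sum_k c_k * h_k(E) to a few
    (irradiance, intensity) chart measurements. f0_curve (G,) and
    basis (K, G) are the model curves sampled on a regular
    irradiance grid in [0, 1]."""
    grid = np.linspace(0.0, 1.0, len(f0_curve))
    f0 = np.interp(irradiance, grid, f0_curve)
    H = np.stack([np.interp(irradiance, grid, h)
                  for h in basis[:n_params]], axis=1)
    c, *_ = np.linalg.lstsq(H, intensity - f0, rcond=None)
    return f0_curve + basis[:n_params].T @ c  # full fitted response
```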
17.
IEEE Trans Pattern Anal Mach Intell ; 26(6): 689-98, 2004 Jun.
Article in English | MEDLINE | ID: mdl-18579930

ABSTRACT

Motion blur due to camera motion can significantly degrade the quality of an image. Since the path of the camera motion can be arbitrary, deblurring of motion blurred images is a hard problem. Previous methods to deal with this problem have included blind restoration of motion blurred images, optical correction using stabilized lenses, and special CMOS sensors that limit the exposure time in the presence of motion. In this paper, we exploit the fundamental trade-off between spatial resolution and temporal resolution to construct a hybrid camera that can measure its own motion during image integration. The acquired motion information is used to compute a point spread function (PSF) that represents the path of the camera during integration. This PSF is then used to deblur the image. To verify the feasibility of hybrid imaging for motion deblurring, we have implemented a prototype hybrid camera. This prototype system was evaluated in different indoor and outdoor scenes using long exposures and complex camera motion paths. The results show that, with minimal resources, hybrid imaging outperforms previous approaches to the motion blur problem. We conclude with a brief discussion on how our ideas can be extended beyond the case of global camera motion to the case where individual objects in the scene move with different velocities.


Subject(s)
Algorithms, Artifacts, Artificial Intelligence, Image Enhancement/methods, Image Interpretation, Computer-Assisted/methods, Pattern Recognition, Automated/methods, Photography/methods, Feasibility Studies, Motion, Reproducibility of Results, Sensitivity and Specificity
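A minimal sketch of the two steps: rasterize the estimated camera path into a PSF, then deconvolve. Richardson-Lucy is used here as a generic deblurring routine, not necessarily the paper's method, and the num_iter keyword assumes a recent scikit-image.

```python
import numpy as np
from skimage.restoration import richardson_lucy

def psf_from_path(path_xy, size=31):
    """Rasterize the (x, y) camera path measured during integration
    into a blur kernel: the PSF value at each offset is the fraction
    of the exposure spent there."""
    psf = np.zeros((size, size))
    c = size // 2
    for x, y in path_xy:
        psf[int(round(y)) + c, int(round(x)) + c] += 1.0
    return psf / psf.sum()

# sharp = richardson_lucy(blurred, psf_from_path(path), num_iter=30)
```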
18.
IEEE Trans Pattern Anal Mach Intell ; 26(7): 831-47, 2004 Jul.
Article in English | MEDLINE | ID: mdl-18579943

ABSTRACT

The histogram of image intensities is used extensively for recognition and for retrieval of images and video from visual databases. A single image histogram, however, suffers from the inability to encode spatial image variation. An obvious way to extend this feature is to compute the histograms of multiple resolutions of an image to form a multiresolution histogram. The multiresolution histogram shares many desirable properties with the plain histogram, including that both are fast to compute, space efficient, invariant to rigid motions, and robust to noise. In addition, the multiresolution histogram directly encodes spatial information. We describe a simple yet novel matching algorithm based on the multiresolution histogram that uses the differences between histograms of consecutive image resolutions. We evaluate it against five widely used image features. We show that with our simple feature we achieve or exceed the performance obtained with more complicated features. Further, we show our algorithm to be the most efficient and robust.


Subject(s)
Algorithms, Artificial Intelligence, Data Interpretation, Statistical, Image Interpretation, Computer-Assisted/methods, Pattern Recognition, Automated/methods, Image Enhancement/methods, Reproducibility of Results, Sensitivity and Specificity
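A compact sketch of the feature and the matching rule, assuming intensities normalized to [0, 1]; the Gaussian pyramid, bin count, and L1 distance follow the description above, but the exact parameters are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multires_histogram_feature(image, levels=4, bins=64):
    """Histograms of the image at several Gaussian resolutions; the
    feature keeps only the differences between consecutive levels,
    which is where the spatial information enters."""
    hists = []
    for k in range(levels):
        im = gaussian_filter(image, sigma=2.0 ** k) if k else image
        h, _ = np.histogram(im, bins=bins, range=(0.0, 1.0), density=True)
        hists.append(h)
    return np.concatenate([hists[k + 1] - hists[k]
                           for k in range(levels - 1)])

def histogram_match_cost(f1, f2):
    return np.abs(f1 - f2).sum()  # L1 distance; smaller = more similar
```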
19.
Appl Opt ; 42(3): 511-25, 2003 Jan 20.
Article in English | MEDLINE | ID: mdl-12570274

ABSTRACT

We present an approach for easily removing the effects of haze from passively acquired images. Our approach is based on the fact that usually the natural illuminating light scattered by atmospheric particles (airlight) is partially polarized. Optical filtering alone cannot remove the haze effects, except in restricted situations. Our method, however, stems from physics-based analysis that works under a wide range of atmospheric and viewing conditions, even if the polarization is low. The approach does not rely on specific scattering models such as Rayleigh scattering and does not rely on the knowledge of illumination directions. It can be used with as few as two images taken through a polarizer at different orientations. As a byproduct, the method yields a range map of the scene, which enables scene rendering as if imaged from different viewpoints. It also yields information about the atmospheric particles. We present experimental results of complete dehazing of outdoor scenes, in far-from-ideal conditions for polarization filtering. We obtain a significant improvement in scene contrast and a correction of color.
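A hedged sketch of the core inversion, following the standard polarization-dehazing equations; estimating p and the horizon airlight from a sky region, as well as the color handling, are left out.

```python
import numpy as np

def dehaze_polarized(i_best, i_worst, p, a_inf):
    """Polarization-based dehazing: i_best/i_worst are images taken
    through a polarizer at the orientations minimizing/maximizing
    airlight; p is the airlight degree of polarization and a_inf its
    value at the horizon (both estimated from a sky region)."""
    total = i_best + i_worst                        # unpolarized-equivalent image
    airlight = (i_worst - i_best) / p               # scattered-light estimate
    t = np.clip(1.0 - airlight / a_inf, 0.05, 1.0)  # transmittance
    return (total - airlight) / t                   # dehazed scene radiance
```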
