ABSTRACT
This paper proposes a novel algorithm for discovering the structure of a kaleidoscopic imaging system that consists of multiple planar mirrors and a camera. A kaleidoscopic imaging system can be regarded as a virtual multi-camera system, with the strong advantages that the virtual cameras are strictly synchronized and share the same intrinsic parameters. In this paper, we focus on the extrinsic calibration of the virtual multi-camera system. The problems to be solved are twofold. The first is to identify the mirror chamber to which each 2D projection of a mirrored 3D point belongs. The second is to estimate all mirror parameters, i.e., the normals and distances of the mirrors. The key contribution of this paper is to propose novel algorithms for these problems that use a single 3D point of unknown geometry by exploiting a kaleidoscopic projection constraint, which is an epipolar constraint on mirror reflections. We demonstrate the performance of the proposed chamber assignment and mirror parameter estimation algorithms with qualitative and quantitative evaluations on synthesized and real data.
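To make the notion of a mirror-generated virtual camera concrete, the following is a minimal sketch of how a planar mirror, described by a normal and a distance, reflects a 3D point. It is not the calibration algorithm of the paper. The plane convention `n . x + d = 0`, the function names, and the numeric values are assumptions made for illustration only.

```python
import numpy as np

def reflection_matrix(n, d):
    """4x4 homogeneous reflection about the plane n . x + d = 0.

    Assumption: d is expressed for the unit-length normal; n is
    re-normalized here only as a safety measure.
    """
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    S = np.eye(4)
    S[:3, :3] = np.eye(3) - 2.0 * np.outer(n, n)  # Householder matrix
    S[:3, 3] = -2.0 * d * n
    return S

def reflect(S, p):
    """Apply a homogeneous reflection S to a 3D point p."""
    return (S @ np.append(np.asarray(p, dtype=float), 1.0))[:3]

# Hypothetical mirror: the plane z = 2 in camera coordinates.
S1 = reflection_matrix([0.0, 0.0, 1.0], -2.0)
p = np.array([0.5, 0.0, 1.0])
p_mirrored = reflect(S1, p)

# Higher-order reflections are products of first-order ones,
# e.g. S1 @ S2 for a point reflected by mirror 2 and then mirror 1.
```

Observing `p_mirrored` with the real camera is equivalent to observing `p` with a virtual camera obtained by reflecting the real camera about the same plane, which is why the mirror normals and distances determine the extrinsics of the virtual cameras.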
ABSTRACT
We introduce a novel neural network-based BRDF model and a Bayesian framework for object inverse rendering, i.e., joint estimation of reflectance and natural illumination from a single image of an object of known geometry. The BRDF is expressed with an invertible neural network, namely, a normalizing flow, which provides the expressive power of a high-dimensional representation, the computational simplicity of a compact analytical model, and the physical plausibility of a real-world BRDF. We extract the latent space of real-world reflectance by conditioning this model, which directly results in a strong reflectance prior. We refer to this model as the invertible neural BRDF model (iBRDF). We also devise a deep illumination prior by leveraging the structural bias of deep neural networks. By integrating this novel BRDF model and the reflectance and illumination priors in a MAP estimation formulation, we show that this joint estimation can be computed efficiently with stochastic gradient descent. We experimentally validate the accuracy of the invertible neural BRDF model on a large set of measured data and demonstrate its use in object inverse rendering on a number of synthetic and real images. The results show new ways in which deep neural networks can help solve challenging radiometric inverse problems.
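As a generic illustration of what makes a normalizing flow invertible, the sketch below implements a single affine coupling layer in NumPy. It is not the iBRDF architecture: the class name, the single-linear-map "network", and the dimensions are hypothetical choices made only to show the forward pass, the exact inverse, and the log-determinant that such layers provide.

```python
import numpy as np

class AffineCoupling:
    """One affine coupling layer: the first part of the input passes
    through unchanged and parameterizes a scale and shift of the rest."""

    def __init__(self, dim, rng):
        self.k = dim // 2
        m = dim - self.k
        # Hypothetical tiny "network": one linear map followed by tanh.
        self.W = rng.normal(scale=0.1, size=(self.k, 2 * m))
        self.b = np.zeros(2 * m)

    def _params(self, x_a):
        h = np.tanh(x_a @ self.W + self.b)
        log_s, t = np.split(h, 2, axis=-1)
        return log_s, t

    def forward(self, x):
        x_a, x_b = x[..., :self.k], x[..., self.k:]
        log_s, t = self._params(x_a)
        y_b = x_b * np.exp(log_s) + t
        log_det = log_s.sum(axis=-1)  # log |det Jacobian| of the layer
        return np.concatenate([x_a, y_b], axis=-1), log_det

    def inverse(self, y):
        y_a, y_b = y[..., :self.k], y[..., self.k:]
        log_s, t = self._params(y_a)  # same parameters, since y_a == x_a
        x_b = (y_b - t) * np.exp(-log_s)
        return np.concatenate([y_a, x_b], axis=-1)

rng = np.random.default_rng(0)
layer = AffineCoupling(dim=4, rng=rng)
x = rng.normal(size=(5, 4))
y, log_det = layer.forward(x)
x_back = layer.inverse(y)
```

Because the inverse is available in closed form and the Jacobian determinant is cheap to evaluate, a stack of such layers can both evaluate densities and be differentiated, which is the general property that gradient-based MAP estimation relies on.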
Subjects
Algorithms; Neural Networks, Computer; Bayes Theorem; Lighting
ABSTRACT
In this paper, we introduce a novel method for reconstructing surface normals and depth of dynamic objects in water. Past shape recovery methods have leveraged various visual cues for estimating shape (e.g., depth) or surface normals. Methods that estimate both compute one from the other. We show that these two geometric surface properties can be simultaneously recovered for each pixel when the object is observed underwater. Our key idea is to leverage multi-wavelength near-infrared light absorption along different underwater light paths in conjunction with surface shading. Our method can handle both Lambertian and non-Lambertian surfaces. We derive a principled theory for this "surface normals and shape from water" method and a practical calibration method for determining its imaging parameter values. By construction, the method can be implemented as a one-shot imaging system. We prototype both an off-line and a video-rate imaging system and demonstrate the effectiveness of the method on a number of real-world static and dynamic objects. The results show that the method can recover intricate surface features that are otherwise inaccessible.
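The absorption cue can be illustrated with a deliberately simplified model. The sketch below assumes Beer-Lambert attenuation, `I_k = I0 * exp(-alpha_k * l)`, with the same unattenuated intensity `I0` (source, reflectance, and shading combined) at both wavelengths, so that the ratio of two observations depends only on the light path length `l` in water. This is not the full method of the paper, which also uses shading to recover normals. The function name and the absorption coefficients are placeholders, not measured values.

```python
import numpy as np

def path_length_from_two_wavelengths(i1, i2, alpha1, alpha2):
    """Per-pixel light path length in water from two intensity images.

    Assumes I_k = I0 * exp(-alpha_k * l) with identical I0 at both
    wavelengths, so log(I1 / I2) = (alpha2 - alpha1) * l.
    """
    i1 = np.asarray(i1, dtype=float)
    i2 = np.asarray(i2, dtype=float)
    return np.log(i1 / i2) / (alpha2 - alpha1)

# Placeholder absorption coefficients (per mm); not measured values.
alpha1, alpha2 = 0.005, 0.03

# Synthetic check: simulate two observations of known path lengths.
l_true = np.array([[100.0, 150.0], [200.0, 250.0]])  # mm
i0 = 0.8                                             # shared unattenuated intensity
img1 = i0 * np.exp(-alpha1 * l_true)
img2 = i0 * np.exp(-alpha2 * l_true)
l_est = path_length_from_two_wavelengths(img1, img2, alpha1, alpha2)
```

Under these assumptions the unknown `I0` cancels in the ratio, which is why two wavelengths with different absorption suffice for a per-pixel path length estimate in this toy setting.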
ABSTRACT
We introduce a novel 3D sensing method for recovering a consistent, dense 3D shape of a dynamic, non-rigid object in water. The method reconstructs a complete (or fuller) 3D surface of the target object in a canonical frame (e.g., rest shape) as it freely deforms and moves between frames by estimating underwater 3D scene flow and using it to integrate per-frame depth estimates recovered from two near-infrared observations. The reconstructed shape is refined in the course of this global non-rigid shape recovery by leveraging both geometric and radiometric constraints. We implement our method with a single camera and a light source without the orthographic assumption on either by deriving a practical calibration method that estimates the point source position with respect to the camera. Our reconstruction method also accounts for scattering by water. We prototype a video-rate imaging system and show 3D shape reconstruction results on a number of real-world static, deformable, and dynamic objects and creatures in real-world water. The results demonstrate the effectiveness of the method in recovering complete shapes of complex, non-rigid objects in water, which opens new avenues of application for underwater 3D sensing in the sub-meter range.
ABSTRACT
We present an approach to capture the 3D motion of a group of people engaged in a social interaction. The core challenges in capturing social interactions are: (1) occlusion is functional and frequent; (2) subtle motion needs to be measured over a space large enough to host a social group; (3) human appearance and configuration variation is immense; and (4) attaching markers to the body may prime the nature of interactions. The Panoptic Studio is a system organized around the thesis that social interactions should be measured through the integration of perceptual analyses over a large variety of viewpoints. We present a modularized system designed around this principle, consisting of integrated structural, hardware, and software innovations. The system takes, as input, 480 synchronized video streams of multiple people engaged in social activities, and produces, as output, the labeled time-varying 3D structure of anatomical landmarks on individuals in the space. Our algorithm is designed to fuse the "weak" perceptual processes in the large number of views by progressively generating skeletal proposals from low-level appearance cues. We also present a framework for temporal refinement that associates body parts with a reconstructed dense 3D trajectory stream. Our system and method are the first to reconstruct the full-body motion of more than five people engaged in social interactions without using markers. We also empirically demonstrate the impact of the number of views in achieving this goal.
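As a basic building block for lifting per-view detections of a landmark to 3D, the sketch below shows standard linear (DLT) triangulation from an arbitrary number of calibrated views. It is a textbook technique, not the skeletal proposal or temporal refinement pipeline of the Panoptic Studio. The function names, camera matrices, and point coordinates are hypothetical.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(projections, points_2d):
    """Linear (DLT) triangulation of one 3D point from N >= 2 views.

    projections: sequence of 3x4 camera matrices.
    points_2d:   sequence of (u, v) observations, one per camera.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Hypothetical setup: three identical cameras translated along the x axis.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
centers = [np.array([-0.5, 0.0, 0.0]),
           np.array([0.0, 0.0, 0.0]),
           np.array([0.5, 0.0, 0.0])]
cameras = [K @ np.hstack([np.eye(3), -c.reshape(3, 1)]) for c in centers]

X_true = np.array([0.2, -0.1, 4.0])
observations = [project(P, X_true) for P in cameras]
X_est = triangulate(cameras, observations)
```

Each additional view contributes two more rows to the linear system, so with noisy detections more views tighten the least-squares estimate.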