ABSTRACT
A lensless camera is an ultra-thin computational imaging system, and a Fresnel Zone Aperture (FZA) mask is beneficial for it because the FZA pattern makes the imaging process easy to model and allows captured images to be reconstructed through a simple and fast deconvolution. However, diffraction causes a mismatch between the forward model used in the reconstruction and the actual imaging process, which degrades the resolution of the recovered image. This work theoretically analyzes the wave-optics imaging model of an FZA lensless camera and focuses on the zero points that diffraction introduces into the frequency response. We propose a novel idea of image synthesis to compensate for the zero points through two different realizations based on linear least-mean-square-error (LMSE) estimation. Results from computer simulations and optical experiments verify a nearly two-fold improvement in spatial resolution with the proposed methods compared with the conventional geometrical-optics-based method.
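The conventional geometrical-optics model mentioned above treats the measurement as a convolution of the scene with the FZA shadow, so the image can be recovered by a single deconvolution. A minimal numpy sketch, assuming a cosine FZA transmittance t(r) = (1 + cos(πr²/β²))/2 and Wiener deconvolution (the β value, image size, and `snr` parameter are illustrative, not the paper's):

```python
import numpy as np

def fza_psf(n, beta):
    """Fresnel Zone Aperture transmittance t(r) = (1 + cos(pi r^2 / beta^2)) / 2."""
    y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
    r2 = x.astype(float) ** 2 + y.astype(float) ** 2
    return 0.5 * (1.0 + np.cos(np.pi * r2 / beta ** 2))

def wiener_deconv(measurement, psf, snr=1e2):
    """Wiener deconvolution in the Fourier domain."""
    H = np.fft.fft2(np.fft.ifftshift(psf))
    G = np.fft.fft2(measurement)
    W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifft2(W * G))

# Geometrical-optics forward model: measurement = scene (*) FZA shadow.
n, beta = 128, 6.0
scene = np.zeros((n, n))
scene[40:50, 60:80] = 1.0                      # illustrative test scene
psf = fza_psf(n, beta)
meas = np.real(np.fft.ifft2(np.fft.fft2(scene) *
                            np.fft.fft2(np.fft.ifftshift(psf))))
recon = wiener_deconv(meas, psf, snr=1e3)
```

Under this idealized (diffraction-free) model the deconvolution recovers the scene almost exactly; the paper's point is that real diffraction puts zeros into the frequency response that this simple inversion cannot compensate.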
ABSTRACT
This study proposes a novel computational imaging system that integrates a see-through screen (STS) with volume holographic optical elements (vHOEs) and a digital camera unit. Owing to the unique features of the vHOE, the STS functions as a holographic waveguide device (HWD) and enables the camera to capture a frontal image while the user gazes at the screen. This system not only provides an innovative solution for high-quality video communication by realizing eye contact but also benefits other visual applications thanks to its compact structure. However, the proposed imaging system faces a dilemma: a wider field of view requires a larger vHOE, but with a larger vHOE, light rays from the same object point are diffracted under different Bragg conditions and undergo different numbers of reflections, which blurs the captured image. In this study, the system's imaging process is analyzed by ray tracing, and a digital image reconstruction method is employed to obtain a clear picture. Optical experiments confirmed the effectiveness of the proposed HWD-STS camera.
ABSTRACT
A mask-based lensless camera optically encodes the scene with a thin mask and reconstructs the image afterward. Improving image reconstruction is one of the most important subjects in lensless imaging. Conventional model-based reconstruction approaches, which leverage knowledge of the physical system, are susceptible to imperfect system modeling. Reconstruction with a purely data-driven deep neural network (DNN) avoids this limitation and therefore has the potential to provide better reconstruction quality. However, existing pure-DNN reconstruction approaches for lensless imaging have not outperformed model-based approaches. We reveal that the multiplexing property of lensless optics makes global features essential for understanding the optically encoded pattern, whereas all existing DNN reconstruction approaches apply fully convolutional networks (FCNs), which are not efficient at global feature reasoning. Based on this analysis, and for the first time to the best of our knowledge, we propose a fully connected neural network with a transformer for image reconstruction. The proposed architecture is better at global feature reasoning and hence enhances the reconstruction. Its superiority is verified by comparison with model-based and FCN-based approaches in an optical experiment.
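The multiplexing property invoked above can be made concrete with a toy shift-invariant forward model: a single point in the scene casts the entire mask shadow onto the sensor, so local scene information becomes global measurement information. A minimal numpy sketch (the mask and sizes are illustrative, not taken from the paper):

```python
import numpy as np

# Multiplexing in mask-based lensless optics: under a shift-invariant
# forward model, measurement = scene circularly convolved with the
# mask shadow. A single point source therefore spreads over the whole
# sensor, which is why global features matter for reconstruction.
rng = np.random.default_rng(0)
mask = rng.random((64, 64))        # illustrative mask shadow pattern
scene = np.zeros((64, 64))
scene[10, 20] = 1.0                # single point source

meas = np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(mask)))

# the point source reproduces the entire mask pattern, merely shifted
fraction_lit = np.mean(meas > 1e-6)
```

Almost every sensor pixel responds to the one nonzero scene pixel, so a network with only local receptive fields must stack many layers before it can relate all the evidence about a single scene point.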
Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Diagnostic Imaging; Humans; Image Processing, Computer-Assisted/methods

ABSTRACT
We propose a preliminary lensless inference camera (LLI camera) specialized for object recognition. Rather than performing computationally expensive image reconstruction before inference, the LLI camera performs computationally efficient data preprocessing on the pattern optically encoded by the mask. The LLI camera thus avoids expensive computation and achieves real-time inference. This work proposes a new data preprocessing approach, named local binary patterns map generation, dedicated to the optically encoded pattern. This preprocessing greatly improves the encoded pattern's robustness to local disturbances in the scene, making practical application of the LLI camera possible. The performance of the LLI camera is analyzed through optical experiments on handwritten digit recognition and gender estimation under changing illumination and with a moving target.
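The abstract does not spell out the map construction, so the following is a generic 3×3 local-binary-pattern sketch (the input size and neighbour ordering are illustrative): each pixel is replaced by an 8-bit code recording which neighbours are at least as bright as the centre. Because the code depends only on ordinal comparisons, it is unchanged by a global illumination rescaling, which is one source of the robustness mentioned above.

```python
import numpy as np

def lbp_map(img):
    """Basic 3x3 local binary pattern: each pixel becomes an 8-bit code
    encoding which of its 8 neighbours are >= the centre pixel."""
    p = np.pad(img, 1, mode='edge')
    c = p[1:-1, 1:-1]
    h, w = c.shape
    # 8 neighbour offsets into the padded array, in a fixed clockwise order
    shifts = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        neighbour = p[dy:dy + h, dx:dx + w]
        code |= (neighbour >= c).astype(np.uint8) << bit
    return code

pattern = np.random.default_rng(0).random((32, 32))  # stand-in encoded pattern
lbp = lbp_map(pattern)
```

Doubling every pixel value leaves the map identical, since every `>=` comparison is unaffected by a positive scale factor.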
ABSTRACT
A mask-based lensless camera adopts a thin mask to optically encode the scene and records the encoded pattern on an image sensor. A lensless camera can be thinner, lighter, and cheaper than a lensed camera, but additional computation is required to reconstruct an image from the encoded pattern. Considering that a major application of the lensless camera could be inference, we propose performing object recognition directly on the encoded pattern. Avoiding image reconstruction not only saves computational resources but also averts reconstruction errors and artifacts. We theoretically analyze the multiplexing property of mask-based lensless optics, which maps local information in the scene to overlapping global information in the encoded pattern. To better extract global features, we propose a simplified Transformer-based architecture. To the best of our knowledge, this is the first study of a Transformer-based architecture for encoded-pattern recognition in mask-based lensless optics. In the optical experiment, the proposed system achieves 91.47% accuracy on Fashion-MNIST and 96.64% ROC AUC on the cats-vs-dogs dataset. The feasibility of physical object recognition is also evaluated.
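The simplified Transformer itself is not specified in the abstract, so the following numpy sketch only illustrates the key property that motivates it: one self-attention layer lets every patch token attend to every other token, giving the global receptive field that a convolution's local window lacks. Patch size, dimensions, and the random weights are illustrative.

```python
import numpy as np

def patchify(img, p):
    """Split an (H, W) image into flattened non-overlapping p x p patch tokens."""
    h, w = img.shape
    return img.reshape(h // p, p, w // p, p).swapaxes(1, 2).reshape(-1, p * p)

def self_attention(x, rng):
    """Single-head self-attention: every output token is a weighted
    mixture of ALL input tokens (global receptive field in one layer)."""
    d = x.shape[1]
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(d)
    # row-wise softmax over all tokens
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v, attn

rng = np.random.default_rng(0)
encoded = rng.random((28, 28))     # stand-in for a sensor-captured encoded pattern
tokens = patchify(encoded, 7)      # 16 tokens of dimension 49
out, attn = self_attention(tokens, rng)
```

Every entry of the attention matrix is strictly positive, so each output token mixes information from all 16 patches in a single layer, which matches the global-feature argument above.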
ABSTRACT
Light field imaging is a promising technique for recording and displaying three-dimensional (3D) scenes. Light field reconstruction using a conventional camera, instead of a lens-array-based plenoptic camera, is expected to achieve higher resolution. However, existing conventional-camera-based methods restrict the camera placement. In this paper, we propose a new, flexible light field reconstruction method in which a conventional camera shoots from random 3D positions and angles. Because it uses a conventional camera and allows the shooting position and angle to be chosen freely, the proposed method offers a flexibility that is expected to enable a wide range of applications.