ABSTRACT
With the recent widespread interest in head-mounted displays for virtual and augmented reality, holography has been considered an appealing technique for a revolutionary and natural 3D visualization system. However, due to the tremendous amount of data required by holograms and the very different properties of holographic data compared with common imagery, the compression of digital holograms is a highly challenging topic for researchers. In this study, we introduce what is, to the best of our knowledge, a novel approach to color hologram compression based on matching pursuit over an overcomplete Gabor dictionary. A detailed framework, from hologram decomposition to bitstream generation, is studied together with a GPU implementation, and the results are discussed and compared with existing hologram compression algorithms.
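The core of the method above is the greedy matching pursuit loop. Below is a minimal sketch of plain matching pursuit on a toy orthonormal dictionary, not the paper's overcomplete Gabor dictionary or its GPU implementation; the signal, atoms, and iteration count are purely illustrative.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Greedy matching pursuit: at each step pick the unit-norm atom
    most correlated with the residual and peel off its contribution."""
    residual = list(signal)
    coeffs = []                       # (atom index, coefficient) pairs
    for _ in range(n_iter):
        # correlation of the residual with every dictionary atom
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        c = scores[k]
        coeffs.append((k, c))
        residual = [r - c * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

# toy orthonormal dictionary in R^2; a real Gabor dictionary is
# overcomplete, so its atoms would be mutually correlated
atoms = [[1.0, 0.0], [0.0, 1.0]]
code, res = matching_pursuit([3.0, -2.0], atoms, 2)
```

With an orthonormal dictionary the residual vanishes after one pass over the atoms; the interest of the overcomplete case is that far fewer atoms can be needed for a given residual energy.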
ABSTRACT
Holographic data play a crucial role in recent three-dimensional imaging as well as microscopy applications. As a result, huge amounts of storage capacity are involved for this kind of data. It therefore becomes necessary to develop efficient hologram compression schemes for storage and transmission purposes. In this paper, we focus on the shifted distance information obtained by the phase-shifting algorithm, where two sets of difference data need to be encoded. More precisely, a nonseparable vector lifting scheme is investigated in order to exploit the two-dimensional characteristics of the holographic contents. Simulations performed on different digital holograms show the effectiveness of the proposed method in terms of bitrate saving and quality of object reconstruction.
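The lifting idea can be illustrated on toy 1-D sequences. The sketch below performs one Haar-style predict/update step on the first difference set and adds a hypothetical inter-component prediction for the second set (the "vector" coupling); the paper's actual nonseparable 2-D filters are not reproduced.

```python
def forward(a, b):
    """One toy lifting level on two difference sets a and b."""
    ae, ao = a[0::2], a[1::2]
    be, bo = b[0::2], b[1::2]
    # intra-channel Haar lifting on the first set
    da = [o - e for o, e in zip(ao, ae)]          # predict step
    sa = [e + d / 2 for e, d in zip(ae, da)]      # update step
    # second set: predict its detail from its own evens AND the first
    # set's detail -- the inter-component ("vector") step, illustrative
    db = [o - e - x for o, e, x in zip(bo, be, da)]
    sb = [e + d / 2 for e, d in zip(be, db)]
    return sa, da, sb, db

def inverse(sa, da, sb, db):
    """Exact inverse: undo the lifting steps in reverse order."""
    ae = [s - d / 2 for s, d in zip(sa, da)]
    ao = [d + e for d, e in zip(da, ae)]
    be = [s - d / 2 for s, d in zip(sb, db)]
    bo = [d + e + x for d, e, x in zip(db, be, da)]
    a = [v for pair in zip(ae, ao) for v in pair]
    b = [v for pair in zip(be, bo) for v in pair]
    return a, b

a = [4.0, 6.0, 2.0, 8.0]
b = [1.0, 5.0, 3.0, 7.0]
rec = inverse(*forward(a, b))
```

Each lifting step is trivially invertible, which is what makes lossy-to-lossless operation possible in such schemes.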
ABSTRACT
Advances in transmission and compression technologies over the past decade have led to a shift of multimedia content towards cloud systems. Multiple copies of the same video are available through numerous distribution systems, with different compression levels, algorithms and resolutions used to match the requirements of particular applications. As 4K display technologies are rapidly adopted, resolution enhancement algorithms become of vital importance. Current solutions do not take into account the particularities of different video encoders, while video reconstruction methods from compressed sources do not provide resolution enhancement. In this paper, we propose a multi-source compressed video enhancement framework in which each description can have a different compression level and resolution. Using a variational formulation based on a modern proximal dual splitting algorithm, we efficiently combine multiple descriptions of the same video. Two applications are proposed: combining two compressed low-resolution (LR) descriptions of a video sequence into a high-resolution (HR) description, and enhancing a compressed HR video using a compressed LR description. Tests are performed over multiple video sequences encoded with High Efficiency Video Coding (HEVC), at different compression levels and at resolutions obtained through multiple down-sampling methods.
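As a much simpler stand-in for the paper's proximal dual splitting, the sketch below fuses two descriptions of a 1-D signal by plain gradient descent on a quadratic variational objective: one data-fidelity term per description plus a quadratic smoothness penalty. All parameter values are illustrative.

```python
def fuse(y1, y2, lam=0.1, step=0.2, n_iter=500):
    """Gradient descent on
       0.5*||x-y1||^2 + 0.5*||x-y2||^2 + lam*sum_i (x[i+1]-x[i])^2,
    i.e. two data-fidelity terms (one per description) plus a quadratic
    smoothness penalty standing in for the paper's regularizer."""
    n = len(y1)
    x = [(a + b) / 2 for a, b in zip(y1, y2)]   # start from the mean
    for _ in range(n_iter):
        # gradient of the two data terms
        g = [(x[i] - y1[i]) + (x[i] - y2[i]) for i in range(n)]
        # gradient of the smoothness term, pairwise
        for i in range(n - 1):
            d = 2 * lam * (x[i + 1] - x[i])
            g[i] -= d
            g[i + 1] += d
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# sanity check: with no regularization the fusion is the plain mean
mean_only = fuse([1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], lam=0.0)
```

A proximal dual splitting method handles the same kind of objective with nonsmooth terms (e.g. total variation) and hard constraints, which plain gradient descent cannot.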
ABSTRACT
Optimal rate allocation is among the most challenging tasks in predictive video coding because of the dependencies between frames induced by motion compensation. In this paper, using a recursive rate-distortion model that explicitly takes these dependencies into account, we approach frame-level rate allocation as a convex optimization problem. This technique is integrated into the recent HEVC encoder and tested on several standard sequences. Experiments indicate that the proposed rate allocation ensures better performance (in the rate-distortion sense) than the standard HEVC rate control, with only a small loss with respect to an optimal exhaustive search, which is largely compensated by a much shorter execution time.
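Dropping the inter-frame dependencies of the recursive model, the classic independent-frame version of this convex allocation has a closed form; the sketch below computes it for the usual exponential distortion model D_i(R_i) = a_i 2^(-2 R_i). The weights are illustrative, and a full solution would clamp negative rates (water-filling), which the paper's dependency-aware model also has to handle.

```python
import math

def allocate(rate_total, a):
    """Closed-form solution of   min sum_i a_i * 2^(-2 R_i)
    subject to sum_i R_i = rate_total, for independent frames.
    Equal-slope condition: every frame ends at the same distortion
    slope, giving R_i = R/n + 0.5*(log2 a_i - mean of log2 a)."""
    n = len(a)
    log_gm = sum(math.log2(ai) for ai in a) / n   # log of geometric mean
    return [rate_total / n + 0.5 * (math.log2(ai) - log_gm) for ai in a]

# a frame that is 4x harder to code receives one extra bit
rates = allocate(2.0, [1.0, 4.0])
```

Here both frames end with distortion a_i·2^(-2R_i) = 0.5, the equal-slope optimum; with motion-compensated dependencies the a_i of a frame depends on the rate given to its references, which is what makes the paper's recursive model necessary.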
ABSTRACT
Recent breakthroughs in motion-compensated temporal wavelet filtering have finally enabled the implementation of highly efficient scalable and error-resilient video codecs. These new wavelet codecs provide numerous advantages over conventional nonscalable techniques based on motion-compensated prediction, such as the absence of a recursive predictive loop, the separation of noise and sampling artifacts from the content through the use of longer temporal filters, and the removal of long-range as well as short-range temporal redundancies. Moreover, these wavelet video coding schemes can provide flexible spatial, temporal, signal-to-noise ratio and complexity scalability with fine granularity over a large range of bit rates, while maintaining a very high coding efficiency. However, most motion-compensated wavelet video schemes are based on classical two-band decompositions that offer only dyadic factors of temporal scalability. In this paper, we propose a three-band temporal structure that extends the concept of motion-compensated temporal filtering (MCTF) introduced in the classical lifting framework. These newly introduced structures provide greater temporal scalability flexibility, as well as improved compression performance compared with dyadic Haar MCTF.
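The two-band Haar MCTF that the proposed three-band structure extends can be sketched as a lifting pair over frame pairs. Motion compensation is set to zero here for clarity (frames are plain sample lists), so the predict step reduces to a frame difference; real MCTF replaces it with a motion-compensated prediction.

```python
def haar_mctf(frames):
    """Two-band Haar temporal lifting (zero motion, for illustration):
    predict each odd frame from the preceding even frame, then update
    the even frame with half the detail to form the low band."""
    highs, lows = [], []
    for even, odd in zip(frames[0::2], frames[1::2]):
        h = [o - e for o, e in zip(odd, even)]        # predict step
        l = [e + d / 2 for e, d in zip(even, h)]      # update step
        highs.append(h)
        lows.append(l)
    return lows, highs

def haar_mctf_inverse(lows, highs):
    """Undo update then predict -- lifting is perfectly invertible."""
    frames = []
    for l, h in zip(lows, highs):
        even = [a - d / 2 for a, d in zip(l, h)]
        odd = [d + e for d, e in zip(h, even)]
        frames += [even, odd]
    return frames

frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
rec = haar_mctf_inverse(*haar_mctf(frames))
```

Each two-band level halves the frame rate, hence the dyadic-only scalability the abstract points out; a three-band split instead divides it by three at each level.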
Subjects
Data Compression/methods , Image Enhancement/methods , Computer-Assisted Image Interpretation/methods , Three-Dimensional Imaging/methods , Photography/methods , Computer-Assisted Signal Processing , Video Recording/methods , Algorithms , Computer-Assisted Numerical Analysis , Subtraction Technique , Time Factors
ABSTRACT
Augmented reality, interactive navigation in 3D scenes, multiview video, and other emerging multimedia applications require large sets of images, hence larger data volumes and increased resources compared with traditional video services. The significant increase in the number of images in multiview systems leads to new challenging problems in data representation and transmission for providing a high quality of experience in resource-constrained environments. In order to reduce the size of the data, different multiview video compression strategies have been proposed recently. Most of them use the concept of reference (or key) views, which are used to estimate the other images when there is high correlation in the data set. In such coding schemes, two questions become fundamental: 1) how many reference views have to be chosen to keep a good reconstruction quality under coding cost constraints? and 2) where should these key views be placed in the multiview data set? As these questions are largely overlooked in the literature, we study the reference view selection problem and propose an algorithm for the optimal selection of reference views in multiview coding systems. Based on a novel metric that measures the similarity between the views, we formulate an optimization problem for the positioning of the reference views, such that both the distortion of the view reconstruction and the coding rate cost are minimized. We solve this problem with a shortest path algorithm that determines both the optimal number of reference views and their positions in the image set. We experimentally validate our solution in a practical multiview distributed coding system and in the standardized 3D-HEVC multiview coding scheme. We show that considering the 3D scene geometry in the reference view positioning problem brings significant rate-distortion improvements and outperforms the traditional coding strategy that simply selects key frames based on the distance between cameras.
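A toy version of the shortest-path selection can be sketched as follows. Views are nodes; an edge i→j makes i and j consecutive reference views, with every intermediate view predicted from view i; and both end views are forced to be references, a simplification. The cost model (pred_cost, ref_cost) is purely illustrative, not the paper's similarity metric.

```python
def select_references(n, pred_cost, ref_cost):
    """Dynamic-programming shortest path over views 0..n-1.
    best[j] = minimal total cost of coding views 0..j with j a
    reference; each edge i->j pays the reference cost of j plus the
    prediction cost of the views strictly between i and j."""
    INF = float("inf")
    best = [INF] * n
    prev = [-1] * n
    best[0] = ref_cost                      # view 0 is always a reference
    for j in range(1, n):
        for i in range(j):
            c = best[i] + ref_cost + sum(pred_cost(i, k)
                                         for k in range(i + 1, j))
            if c < best[j]:
                best[j], prev[j] = c, i
    refs, j = [], n - 1                     # backtrack the chosen path
    while j != -1:
        refs.append(j)
        j = prev[j]
    return best[n - 1], refs[::-1]

# illustrative model: prediction cost grows quadratically with the
# distance to the reference, each reference costs 3 rate units
cost, refs = select_references(4, lambda i, k: float((k - i) ** 2), 3.0)
```

The path jointly decides how many references to spend and where to put them, which is exactly the two questions raised in the abstract.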
ABSTRACT
In this paper, we develop an efficient bit allocation strategy for subband-based image coding systems. More specifically, our objective is to design a new optimization algorithm based on a rate-distortion optimality criterion. To this end, we consider the uniform scalar quantization of a class of mixed distributed sources following a Bernoulli-generalized Gaussian distribution. This model appears to be particularly well-adapted for image data, which have a sparse representation in a wavelet basis. In this paper, we propose new approximations of the entropy and the distortion functions using piecewise affine and exponential forms, respectively. Because of these approximations, bit allocation is reformulated as a convex optimization problem. Solving the resulting problem allows us to derive the optimal quantization step for each subband. Experimental results show the benefits that can be drawn from the proposed bit allocation method in a typical transform-based coding application.
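The convex allocation idea can be illustrated in its generic operational-Lagrangian form: each subband exposes a set of (rate, distortion) points, e.g. obtained by sweeping the uniform quantization step, and a multiplier is bisected to meet a rate budget. The paper instead uses closed-form piecewise-affine entropy and exponential distortion approximations for the Bernoulli-generalized Gaussian model; the points below are made up.

```python
def lagrangian_allocate(rd_points, lam):
    """For each subband pick the operating point minimising D + lam*R;
    rd_points[i] is a list of (rate, distortion) pairs for subband i."""
    return [min(points, key=lambda p: p[1] + lam * p[0])
            for points in rd_points]

def allocate_for_budget(rd_points, budget, lo=0.0, hi=100.0, n_iter=60):
    """Bisect the multiplier until the total rate meets the budget."""
    for _ in range(n_iter):
        lam = (lo + hi) / 2
        total = sum(r for r, _ in lagrangian_allocate(rd_points, lam))
        if total > budget:
            lo = lam           # too much rate: penalise rate more
        else:
            hi = lam
    return lagrangian_allocate(rd_points, hi)

rd = [
    [(0, 4.0), (1, 1.0), (2, 0.25)],    # subband 1: large variance
    [(0, 1.0), (1, 0.3), (2, 0.1)],     # subband 2: small variance
]
picked = lagrangian_allocate(rd, 0.5)
budgeted = allocate_for_budget(rd, 2.0)
```

Replacing the discrete point sets with the paper's analytic entropy/distortion approximations turns this sweep into a smooth convex program whose solution directly yields the optimal quantization step per subband.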
Subjects
Algorithms , Artifacts , Image Enhancement/methods , Computer-Assisted Image Interpretation/methods , Computer-Assisted Signal Processing , Statistical Data Interpretation , Reproducibility of Results , Sample Size , Sensitivity and Specificity
ABSTRACT
Nonlocal total variation (NLTV) has emerged as a useful tool in variational methods for image recovery problems. In this paper, we extend NLTV-based regularization to multicomponent images by taking advantage of the structure tensor (ST) resulting from the gradient of a multicomponent image. The proposed approach allows us to penalize the nonlocal variations jointly across the different components through various ℓ1,p matrix norms with p ≥ 1. To facilitate the choice of the hyperparameters, we adopt a constrained convex optimization approach in which we minimize the data fidelity term subject to a constraint involving the ST-NLTV regularization. The resulting convex optimization problem is solved with a novel epigraphical projection method. This formulation can be efficiently implemented thanks to the flexibility offered by recent primal-dual proximal algorithms. Experiments are carried out on color, multispectral, and hyperspectral images. The results demonstrate the benefit of introducing a nonlocal ST regularization and show that the proposed approach leads to significant improvements in terms of convergence speed over current state-of-the-art methods, such as the alternating direction method of multipliers.
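A toy version of the mixed-norm penalty reads as follows: at each pixel, gather the nonlocal differences of all components into a matrix, take an entrywise ℓp norm, and sum over pixels (the outer ℓ1). This is a simplification; the paper's ℓ1,p matrix norms may differ in how the inner matrix norm is taken.

```python
def l1p_penalty(diffs, p):
    """ST-NLTV-style mixed norm (toy): diffs[n] is the matrix of
    nonlocal finite differences at pixel n, one row per image
    component.  Inner l_p over each matrix's entries, outer l_1 over
    pixels.  p = 2 couples the components; p = 1 decouples them into
    an ordinary per-component NLTV."""
    total = 0.0
    for mat in diffs:
        entries = [abs(v) for row in mat for v in row]
        total += sum(e ** p for e in entries) ** (1.0 / p)
    return total

# two pixels: one with differences (3, 4), one with a single
# difference of 1 (shapes are illustrative)
diffs = [[[3.0, 4.0]], [[1.0]]]
joint = l1p_penalty(diffs, 2)       # 5 + 1
separable = l1p_penalty(diffs, 1)   # 3 + 4 + 1
```

The p > 1 coupling is what lets edges shared by several spectral components be penalized once rather than per component.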
ABSTRACT
Many research efforts have been devoted to improving stereo image coding techniques for storage or transmission. In this paper, we are mainly interested in lossy-to-lossless coding schemes for stereo images that allow progressive reconstruction. The most commonly used approaches for stereo compression are based on disparity compensation techniques. The basic principle of this technique is first to estimate the disparity map; then, one image is considered as a reference and the other is predicted from it in order to generate a residual image. In this paper, we propose a novel approach based on vector lifting schemes (VLS), which offers the advantage of generating two compact multiresolution representations of the left and the right views. We present two versions of this new scheme. A theoretical analysis of the performance of the considered VLS is also conducted. Experimental results indicate a significant improvement with the proposed structures compared with conventional methods.
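The disparity-compensation baseline described above can be sketched on 1-D rows: for each pixel of one view, search the best match in the reference view within a small disparity range and code only the residual. The exhaustive per-pixel search and the toy data are illustrative.

```python
def disparity_residual(left, right, max_disp):
    """Classic disparity-compensated stereo coding step (1-D rows for
    brevity): the left view is the reference, the right view is
    predicted from it and reduced to a residual.  The vector lifting
    scheme of the paper avoids this asymmetric reference/residual
    split, but this is the baseline it improves on."""
    disp, resid = [], []
    for i, v in enumerate(right):
        best_d, best_err = 0, float("inf")
        for d in range(max_disp + 1):
            if i + d < len(left):
                err = abs(left[i + d] - v)
                if err < best_err:
                    best_d, best_err = d, err
        disp.append(best_d)
        resid.append(v - left[i + best_d])
    return disp, resid

def reconstruct(left, disp, resid):
    """Decoder side: apply the disparity map and add back the residual."""
    return [left[i + d] + r for i, (d, r) in enumerate(zip(disp, resid))]

left = [10, 20, 30, 40]
right = [20, 30, 40, 40]          # roughly the left row shifted by one
disp, resid = disparity_residual(left, right, max_disp=2)
rec = reconstruct(left, disp, resid)
```

Note the asymmetry: the left view must be fully decoded before the right view is usable, which is precisely what a joint multiresolution representation such as VLS avoids.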