RESUMO
In recent years, point clouds have become increasingly popular for representing three-dimensional (3D) visual objects and scenes. To efficiently store and transmit point clouds, compression methods have been developed, but they often result in a degradation of quality. To reduce color distortion in point clouds, we propose a graph-based quality enhancement network (GQE-Net) that uses geometry information as an auxiliary input and graph convolution blocks to extract local features efficiently. Specifically, we use a parallel-serial graph attention module with a multi-head graph attention mechanism to focus on important points or features and help them fuse together. Additionally, we design a feature refinement module that takes into account the normals and geometry distance between points. To work within the limitations of GPU memory capacity, the distorted point cloud is divided into overlap-allowed 3D patches, which are sent to GQE-Net for quality enhancement. To account for differences in data distribution among different color components, three models are trained for the three color components. Experimental results show that our method achieves state-of-the-art performance. For example, when implementing GQE-Net on a recent test model of the geometry-based point cloud compression (G-PCC) standard, 0.43 dB, 0.25 dB and 0.36 dB Bjφntegaard delta (BD)-peak-signal-to-noise ratio (PSNR), corresponding to 14.0%, 9.3% and 14.5% BD-rate savings were achieved on dense point clouds for the Y, Cb, and Cr components, respectively. The source code of our method is available at https://github.com/xjr998/GQE-Net.
RESUMO
We propose a generative adversarial network for point cloud upsampling, which can not only make the upsampled points evenly distributed on the underlying surface but also efficiently generate clean high frequency regions. The generator of our network includes a dynamic graph hierarchical residual aggregation unit and a hierarchical residual aggregation unit for point feature extraction and upsampling, respectively. The former extracts multiscale point-wise descriptive features, while the latter captures rich feature details with hierarchical residuals. To generate neat edges, our discriminator uses a graph filter to extract and retain high frequency points. The generated high resolution point cloud and corresponding high frequency points help the discriminator learn the global and high frequency properties of the point cloud. We also propose an identity distribution loss function to make sure that the upsampled points remain on the underlying surface of the input low resolution point cloud. To assess the regularity of the upsampled points in high frequency regions, we introduce two evaluation metrics. Objective and subjective results demonstrate that the visual quality of the upsampled point clouds generated by our method is better than that of the state-of-the-art methods.
RESUMO
Mismatches between the precisions of representing the disparity, depth value and rendering position in 3D video systems cause redundancies in depth map representations. In this paper, we propose a highly efficient multiview depth coding scheme based on Depth Histogram Projection (DHP) and Allowable Depth Distortion (ADD) in view synthesis. Firstly, DHP exploits the sparse representation of depth maps generated from stereo matching to reduce the residual error from INTER and INTRA predictions in depth coding. We provide a mathematical foundation for DHP-based lossless depth coding by theoretically analyzing its rate-distortion cost. Then, due to the mismatch between depth value and rendering position, there is a many-to-one mapping relationship between them in view synthesis, which induces the ADD model. Based on this ADD model and DHP, depth coding with lossless view synthesis quality is proposed to further improve the compression performance of depth coding while maintaining the same synthesized video quality. Experimental results reveal that the proposed DHP based depth coding can achieve an average bit rate saving of 20.66% to 19.52% for lossless coding on Multiview High Efficiency Video Coding (MV-HEVC) with different groups of pictures. In addition, our depth coding based on DHP and ADD achieves an average depth bit rate reduction of 46.69%, 34.12% and 28.68% for lossless view synthesis quality when the rendering precision varies from integer, half to quarter pixels, respectively. We obtain similar gains for lossless depth coding on the 3D-HEVC, HEVC Intra coding and JPEG2000 platforms.
RESUMO
In rate-distortion optimization, the encoder settings are determined by maximizing a reconstruction quality measure subject to a constraint on the bitrate. One of the main challenges of this approach is to define a quality measure that can be computed with low computational cost and which correlates well with the perceptual quality. While several quality measures that fulfil these two criteria have been developed for images and videos, no such one exists for point clouds. We address this limitation for the video-based point cloud compression (V-PCC) standard by proposing a linear perceptual quality model whose variables are the V-PCC geometry and color quantization step sizes and whose coefficients can easily be computed from two features extracted from the original point cloud. Subjective quality tests with 400 compressed point clouds show that the proposed model correlates well with the mean opinion score, outperforming state-of-the-art full reference objective measures in terms of Spearman rank-order and Pearson linear correlation coefficient. Moreover, we show that for the same target bitrate, rate-distortion optimization based on the proposed model offers higher perceptual quality than rate-distortion optimization based on exhaustive search with a point-to-point objective quality metric. Our datasets are publicly available at https://github.com/qdushl/Waterloo-Point-Cloud-Database-2.0.
RESUMO
We consider a joint source-channel coding system that protects an embedded bitstream using a finite family of channel codes with error detection and error correction capability. The performance of this system may be measured by the expected distortion or by the expected number of correctly decoded source bits. Whereas a rate-based optimal solution can be found in linear time, the computation of a distortion-based optimal solution is prohibitive. Under the assumption of the convexity of the operational distortion-rate function of the source coder, we give a lower bound on the expected distortion of a distortion-based optimal solution that depends only on a rate-based optimal solution. Then, we propose a local search (LS) algorithm that starts from a rate-based optimal solution and converges in linear time to a local minimum of the expected distortion. Experimental results for a binary symmetric channel show that our LS algorithm is near optimal, whereas its complexity is much lower than that of the previous best solution.