Results 1 - 20 of 23
1.
IEEE Trans Pattern Anal Mach Intell ; 46(7): 5174-5191, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38376966

ABSTRACT

As an emerging research practice that leverages recent AI techniques, e.g., deep-model-based prediction and generation, Video Coding for Machines (VCM) is committed to bridging the largely separate research tracks of video/image compression and feature compression, and attempts to jointly optimize compactness and efficiency from a unified perspective of high-accuracy machine vision and full-fidelity human vision. With the rapid advances of deep feature representation and visual data compression in mind, in this paper we summarize VCM methodology and philosophy based on existing academic and industrial efforts. The development of VCM follows a general rate-distortion optimization, and we establish a categorization of key modules and techniques, including feature-assisted coding, scalable coding, intermediate feature compression/optimization, and machine-vision-targeted codecs, from the broader perspectives of vision tasks, analytics resources, etc. Previous works show that, although existing methods attempt to reveal the nature of scalable representation in bits when dealing with machine and human vision tasks, the generality of low-bit-rate representations, and accordingly how to support a variety of visual analytics tasks, remains rarely studied. Therefore, we investigate a novel visual information compression problem for the analytics taxonomy, to strengthen the capability of compact visual representations extracted from multiple tasks for visual analytics. A new perspective on the relationship between tasks and compression is revisited. Keeping in mind the transferability among different machine vision tasks (e.g., high-level semantic and mid-level geometry-related tasks), we aim to support multiple tasks jointly at low bit rates. In particular, to narrow the dimensionality gap between neural-network features extracted from pixels and a variety of machine vision features/labels (e.g., scene classes, segmentation labels), a codebook hyperprior is designed to compress the neural-network-generated features. As demonstrated in our experiments, this new hyperprior model improves feature compression efficiency by estimating the signal entropy more accurately, which enables further investigation of the granularity of abstracting compact features across different tasks.
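As a toy illustration of the codebook-hyperprior idea, the sketch below (our own simplification, not the paper's model) soft-assigns features to learned codewords and uses a learned prior over code indices as a rate estimate:

```python
# Hypothetical sketch of a codebook-style hyperprior: features are matched to
# learned codewords, and the code-index distribution gives a rate estimate.
# All names and shapes are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CodebookHyperprior(nn.Module):
    def __init__(self, feat_dim=256, num_codes=512):
        super().__init__()
        self.codebook = nn.Parameter(torch.randn(num_codes, feat_dim))
        self.prior_logits = nn.Parameter(torch.zeros(num_codes))  # learned code prior

    def forward(self, feats):                      # feats: (N, feat_dim)
        d = torch.cdist(feats, self.codebook)      # (N, num_codes) distances
        assign = F.softmax(-d, dim=1)              # soft nearest-codeword assignment
        log_prior = F.log_softmax(self.prior_logits, dim=0)
        rate = -(assign * log_prior).sum(dim=1).mean()   # expected code cost (nats)
        recon = assign @ self.codebook             # soft-quantized reconstruction
        distortion = F.mse_loss(recon, feats)
        return rate, distortion

feats = torch.randn(8, 256)
rate, distortion = CodebookHyperprior()(feats)
print(rate.item(), distortion.item())
```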

2.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9439-9453, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37022832

ABSTRACT

Removing undesired moiré patterns from images that capture content displayed on screens is of increasing research interest, as the need to record and share the instant information conveyed by screens grows. Previous demoiréing methods provide only limited investigation into the formation process of moiré patterns and thus cannot exploit moiré-specific priors to guide the learning of demoiréing models. In this paper, we investigate the moiré pattern formation process from the perspective of signal aliasing and correspondingly propose a coarse-to-fine disentangling demoiréing framework. In this framework, we first disentangle the moiré pattern layer from the clean image with alleviated ill-posedness, based on the derivation of our moiré image formation model. We then refine the demoiréing results by exploiting both frequency-domain features and edge attention, considering the properties of moiré patterns in spectral distribution and edge intensity revealed by our aliasing-based analysis. Experiments on several datasets show that the proposed method performs favorably against state-of-the-art methods. The proposed method also adapts well to different data sources and scales, especially high-resolution moiré images.
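The aliasing view of moiré can be reproduced in a few lines: sampling a fine screen grid below its Nyquist rate folds the high frequency into a visible low-frequency pattern. A toy numpy demonstration (ours, with arbitrary frequencies):

```python
# Toy illustration (our own, not the paper's code) of moiré as aliasing:
# sampling a fine screen-pixel grid below its Nyquist rate folds the high
# frequency into a visible low-frequency pattern.
import numpy as np

n = 512
x = np.arange(n)
screen = np.sin(2 * np.pi * 0.46 * x)        # fine grid near Nyquist (f = 0.46)
captured = screen[::2]                       # camera samples at half the rate
spec = np.abs(np.fft.rfft(captured))
alias_f = np.fft.rfftfreq(captured.size)[spec.argmax()]
print(f"aliased frequency: {alias_f:.2f}")   # ~0.08, i.e. 2*0.46 folded back from 1.0
```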


Asunto(s)
Algoritmos , Topografía de Moiré
3.
IEEE Trans Pattern Anal Mach Intell ; 45(2): 1424-1441, 2023 Feb.
Article in English | MEDLINE | ID: mdl-35439129

ABSTRACT

Reflection removal has been discussed for decades. This paper provides an analysis of different reflection properties and the factors that influence image formation, an up-to-date taxonomy of existing methods, a benchmark dataset, and unified benchmarking evaluations of state-of-the-art (especially learning-based) methods. Specifically, this paper presents a SIngle-image Reflection Removal Plus dataset, "SIR2+", with new consideration of in-the-wild scenarios and glass with diverse colors and non-planar shapes. We further perform quantitative and visual quality comparisons of state-of-the-art single-image reflection removal algorithms. Open problems for improving reflection removal algorithms are discussed at the end. Our dataset and follow-up updates can be found at https://reflectionremoval.github.io/sir2data/.

4.
IEEE Trans Image Process ; 31: 1391-1405, 2022.
Article in English | MEDLINE | ID: mdl-35038292

ABSTRACT

In this paper, we make the first benchmark effort to examine the superiority of using RAW images for low-light enhancement and develop a novel alternative route that utilizes RAW images in a more flexible and practical way. Motivated by a full consideration of the typical image processing pipeline, we develop a new evaluation framework, the Factorized Enhancement Model (FEM), which decomposes the properties of RAW images into measurable factors and provides a tool for empirically exploring how the properties of RAW images affect enhancement performance. The benchmark results show that the linearity of the data and the exposure time recorded in metadata play the most critical roles, bringing distinct gains in various measures over approaches that take sRGB images as input. With the insights obtained from the benchmark results, a RAW-guiding Exposure Enhancement Network (REENet) is developed, which trades off the advantages and inaccessibility of RAW images in real applications by using RAW images only in the training phase. REENet projects sRGB images into linear RAW domains to apply constraints with the corresponding RAW images, reducing the difficulty of model training. In the testing phase, REENet does not rely on RAW images. Experimental results demonstrate not only the superiority of REENet over state-of-the-art sRGB-based methods but also the effectiveness of the RAW guidance and of all components.
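One plausible reading of "projecting sRGB images into linear RAW domains" is the textbook inverse sRGB transfer function; the actual projection in REENet may differ. A minimal sketch:

```python
# Standard inverse sRGB transfer function, shown as one plausible way to
# project sRGB values into a linear-light domain (the paper's exact
# projection may differ; this is only the textbook sRGB linearization).
import numpy as np

def srgb_to_linear(srgb):
    """srgb: float array in [0, 1]; returns linear-light values."""
    srgb = np.asarray(srgb, dtype=np.float64)
    return np.where(srgb <= 0.04045,
                    srgb / 12.92,
                    ((srgb + 0.055) / 1.055) ** 2.4)

print(srgb_to_linear([0.0, 0.5, 1.0]))  # [0.0, ~0.2140, 1.0]
```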

5.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6854-6871, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34310289

ABSTRACT

Vehicle Re-Identification (ReID) is of great significance for public security and intelligent transportation. Large and comprehensive datasets are crucial for the development of vehicle ReID, both for model training and for evaluation. However, existing datasets in this field have limitations in many respects, including constrained capture conditions, limited variation in vehicle appearance, and small training and test sets. Hence, a new, large, and challenging benchmark for vehicle ReID is urgently needed. In this paper, we propose a large vehicle ReID dataset, called VERI-Wild 2.0, containing 825,042 images. It is captured using a city-scale surveillance camera system consisting of 274 cameras covering an area of over 200 km². The samples in our dataset present very rich appearance diversity thanks to the long collection time span, unconstrained capture viewpoints, various illumination conditions, diversified background environments, and different weather conditions. Furthermore, to facilitate more practical benchmarking, we define a challenging and large test set containing about 400K vehicle images that have no camera overlap with the training set. VERI-Wild 2.0 is expected to facilitate the design, adaptation, development, and evaluation of different types of learning models for vehicle ReID. In addition, we design a new method for vehicle ReID. We observe that orientation is a crucial factor for feature matching in vehicle ReID. To match vehicle pairs captured from similar orientations, the learned features should capture specific, detailed differential information that discriminates visually similar yet different vehicles; in contrast, when matching samples captured from different orientations, the features should capture orientation-invariant common information. We therefore propose a novel disentangled feature learning network (DFNet). It explicitly considers orientation information for vehicle ReID and concurrently learns orientation-specific and orientation-common features, which can then be adaptively exploited via an adaptive matching scheme when dealing with pairs from similar or different orientations. Comprehensive experimental results show the effectiveness of our proposed method.
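A hedged sketch of what such an adaptive matching scheme could look like: blend the orientation-specific and orientation-common distances according to orientation similarity. The blending rule and names below are our assumptions, not DFNet's actual scheme:

```python
# Hypothetical adaptive matching: for similar orientations, weight the
# orientation-specific distance; otherwise rely on the orientation-common one.
import torch
import torch.nn.functional as F

def adaptive_distance(spec_a, spec_b, comm_a, comm_b, orient_a, orient_b):
    sim = F.cosine_similarity(orient_a, orient_b, dim=-1)    # orientation similarity
    w = torch.sigmoid(5.0 * sim)                             # ~1 for similar views
    d_spec = 1 - F.cosine_similarity(spec_a, spec_b, dim=-1)
    d_comm = 1 - F.cosine_similarity(comm_a, comm_b, dim=-1)
    return w * d_spec + (1 - w) * d_comm

a = [torch.randn(128) for _ in range(3)]
b = [torch.randn(128) for _ in range(3)]
print(adaptive_distance(a[0], b[0], a[1], b[1], a[2], b[2]).item())
```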

6.
IEEE Trans Image Process ; 30: 7815-7829, 2021.
Article in English | MEDLINE | ID: mdl-34388092

ABSTRACT

Unsupervised domain adaptive (UDA) person re-identification (re-ID) is a challenging task due to the absence of labels for the target-domain data. To handle this problem, some recent works adopt clustering algorithms to generate pseudo labels off-line, which can then be used as the supervision signal for on-line feature learning in the target domain. However, the off-line generated labels often contain substantial noise that significantly hinders the discriminability of the on-line learned features and thus limits the final UDA re-ID performance. To this end, we propose a novel approach, called Dual-Refinement, that jointly refines pseudo labels in the off-line clustering phase and features in the on-line training phase, to alternately boost label purity and feature discriminability in the target domain for more reliable re-ID. Specifically, in the off-line phase, a new hierarchical clustering scheme is proposed that selects representative prototypes for every coarse cluster, so labels can be effectively refined using the inherent hierarchical information of person images. In the on-line phase, we propose an instant memory spread-out (IM-spread-out) regularization that takes advantage of a proposed instant memory bank to store sample features of the entire dataset and enable spread-out feature learning over the entire training data instantly. Our Dual-Refinement method reduces the influence of noisy labels and refines the learned features within this alternating training process. Experiments demonstrate that our method outperforms state-of-the-art methods by a large margin.
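The spread-out idea can be sketched as an instance-discrimination loss against a memory bank holding one slot per training sample; the exact IM-spread-out formulation may differ from this simplification:

```python
# Sketch (our assumption of the general form) of a spread-out regularizer over a
# memory bank: each feature should be dissimilar to all non-corresponding entries.
import torch
import torch.nn.functional as F

def spread_out_loss(feats, memory, own_idx, tau=0.05):
    """feats: (B, D) L2-normalized; memory: (N, D) bank of all samples."""
    sims = feats @ memory.t() / tau        # (B, N) scaled similarities
    # instance-level: treat each sample's own memory slot as the positive
    return F.cross_entropy(sims, own_idx)

memory = F.normalize(torch.randn(1000, 128), dim=1)
feats = F.normalize(torch.randn(16, 128), dim=1)
own_idx = torch.randint(0, 1000, (16,))
print(spread_out_loss(feats, memory, own_idx).item())
```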

7.
IEEE Trans Image Process ; 30: 6715-6729, 2021.
Article in English | MEDLINE | ID: mdl-34236966

ABSTRACT

Unsupervised domain adaptation (UDA) for person Re-Identification (ReID) aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Recent works mainly optimize ReID models with pseudo labels generated by unsupervised clustering on the target domain. However, pseudo labels generated by unsupervised clustering methods are often unreliable, due to severe intra-person variations and the complicated cluster structures of practical application scenarios. In this work, to handle complicated cluster structures, we propose a novel learnable Hierarchical Connectivity-Centered (HCC) clustering scheme based on Graph Convolutional Networks (GCNs) to generate more reliable pseudo labels. Our HCC scheme learns the complicated cluster structure by hierarchically estimating the connectivity among samples from the vertex level to the cluster level in a graph representation, thereby progressively refining the pseudo labels. Additionally, to handle intra-person variations in clustering, we propose a novel relation feature for HCC clustering, which exploits identities from the source domain as references to represent target-domain samples. Experiments demonstrate that our method achieves state-of-the-art performance on three challenging benchmarks.
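A minimal sketch of a relation feature, assuming it is a similarity profile of each target sample against source-identity centroids (the paper's construction may differ):

```python
# Illustrative relation feature: describe each target sample by its similarities
# to source-domain identity references, then cluster in that relation space.
import torch
import torch.nn.functional as F

def relation_features(target_feats, source_centroids):
    """target_feats: (N, D); source_centroids: (C, D), one per source identity."""
    t = F.normalize(target_feats, dim=1)
    s = F.normalize(source_centroids, dim=1)
    return F.softmax(t @ s.t(), dim=1)     # (N, C) similarity profile per sample

rel = relation_features(torch.randn(32, 256), torch.randn(751, 256))
print(rel.shape)  # torch.Size([32, 751])
```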

8.
Article in English | MEDLINE | ID: mdl-32857694

ABSTRACT

Video coding, which aims to compress and reconstruct whole frames, and feature compression, which preserves and transmits only the most critical information, stand at two ends of a scale: one offers compactness and efficiency to serve machine vision, and the other offers full fidelity in service of human perception. Recent endeavors in the imminent trends of video compression, e.g., deep-learning-based coding tools and end-to-end image/video coding, and the MPEG-7 compact feature descriptor standards, i.e., Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, promote sustained, rapid development in their respective directions. In this paper, thanks to booming AI technology, e.g., prediction and generation models, we explore the new area of Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts. Towards collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. Aligning with the rising "Analyze then Compress" instance, Digital Retina, the definition, formulation, and paradigm of VCM are given first. We then systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, providing academic and industrial evidence for realizing the collaborative compression of video and feature streams in a broad range of AI applications. Finally, we present potential VCM solutions, whose preliminary results demonstrate performance and efficiency gains. Further directions are discussed as well.

9.
IEEE Trans Pattern Anal Mach Intell ; 42(2): 494-501, 2020 Feb.
Article in English | MEDLINE | ID: mdl-30676946

ABSTRACT

In this paper, a feature boosting network is proposed for estimating 3D hand pose and 3D body pose from a single RGB image. In this method, the features learned by the convolutional layers are boosted with a new long short-term dependence-aware (LSTD) module, which enables the intermediate convolutional feature maps to perceive the graphical long short-term dependency among different hand (or body) parts using the designed Graphical ConvLSTM. Learning a set of features that are reliable and discriminatively representative of the pose of a hand (or body) part is difficult due to ambiguities, texture and illumination variation, and self-occlusion in real applications of 3D pose estimation. To improve the reliability of the features for representing each body part and to enhance the LSTD module, we further introduce a context consistency gate (CCG), with which the convolutional feature maps are modulated according to their consistency with the context representations. We evaluate the proposed method on challenging benchmark datasets for 3D hand pose estimation and 3D full-body pose estimation. Experimental results show the effectiveness of our method, which achieves state-of-the-art performance on both tasks.
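A minimal sketch, assuming the CCG gates feature maps by a sigmoid of their agreement with a pooled context vector (the actual wiring in the paper may differ):

```python
# Hypothetical context-consistency gate: modulate each location of the feature
# map by a sigmoid of its agreement with a globally pooled context vector.
import torch
import torch.nn as nn

class ContextConsistencyGate(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, fmap):                                # fmap: (B, C, H, W)
        context = fmap.mean(dim=(2, 3), keepdim=True)       # global context, (B, C, 1, 1)
        gate = torch.sigmoid(self.proj(fmap) * context)     # per-location consistency
        return fmap * gate                                  # modulate features by agreement

x = torch.randn(2, 64, 16, 16)
print(ContextConsistencyGate(64)(x).shape)
```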


Asunto(s)
Mano/diagnóstico por imagen , Imagenología Tridimensional/métodos , Aprendizaje Automático , Postura/fisiología , Humanos , Reproducibilidad de los Resultados
10.
IEEE Trans Pattern Anal Mach Intell ; 42(6): 1453-1467, 2020 Jun.
Article in English | MEDLINE | ID: mdl-30762531

ABSTRACT

Action prediction aims to recognize the class label of an ongoing activity when only part of it has been observed. In this paper, we focus on online action prediction in streaming 3D skeleton sequences. A dilated convolutional network is introduced to model the motion dynamics in the temporal dimension via a sliding window over the temporal axis. Since the observed part of the ongoing action exhibits significant temporal scale variation at different time steps, a novel window scale selection method is proposed to make our network focus on the performed part of the ongoing action and suppress possible interference from previous actions at each step. An activation sharing scheme is also proposed to handle the overlapping computations among adjacent time steps, which enables our framework to run more efficiently. Moreover, to enhance action prediction with skeletal input data, a hierarchy of dilated tree convolutions is also designed to learn multi-level structured semantic representations over the skeleton joints at each frame. Our proposed approach is evaluated on four challenging datasets. Extensive experiments demonstrate the effectiveness of our method for skeleton-based online action prediction.
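The temporal modeling backbone can be sketched as a stack of dilated 1-D convolutions over per-frame skeleton features; the layer sizes below are illustrative, not the paper's configuration:

```python
# Sketch of the dilated temporal modeling idea: a dilated 1-D convolution stack
# over per-frame skeleton features, predicting from the latest time step.
import torch
import torch.nn as nn

class DilatedTemporalNet(nn.Module):
    def __init__(self, in_dim=75, hidden=128, num_classes=60):
        super().__init__()
        layers, ch = [], in_dim
        for d in (1, 2, 4, 8):     # growing temporal receptive field
            layers += [nn.Conv1d(ch, hidden, 3, padding=d, dilation=d), nn.ReLU()]
            ch = hidden
        self.tcn = nn.Sequential(*layers)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):          # x: (B, T, in_dim) streaming skeleton features
        h = self.tcn(x.transpose(1, 2))     # (B, hidden, T)
        return self.fc(h[:, :, -1])         # class logits at the current step

logits = DilatedTemporalNet()(torch.randn(4, 32, 75))
print(logits.shape)   # torch.Size([4, 60])
```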

11.
IEEE Trans Pattern Anal Mach Intell ; 42(10): 2684-2701, 2020 Oct.
Article in English | MEDLINE | ID: mdl-31095476

ABSTRACT

Research on depth-based human activity analysis has achieved outstanding performance and demonstrated the effectiveness of 3D representations for action recognition. However, existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, of a realistic number of distinct class categories, of diversity in camera views, of varied environmental conditions, and of variety in human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, collected from 106 distinct subjects and containing more than 114 thousand video samples and 8 million frames. The dataset contains 120 different action classes covering daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset and show the advantage of applying deep learning methods to 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset, and propose a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework for this task, which yields promising results in recognizing novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding.


Subject(s)
Deep Learning; Human Activities/classification; Image Processing, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Algorithms; Benchmarking; Humans; Semantics; Video Recording
12.
IEEE Trans Pattern Anal Mach Intell ; 42(12): 2969-2982, 2020 Dec.
Article in English | MEDLINE | ID: mdl-31180841

ABSTRACT

Removing undesired reflections from images taken through glass benefits various computer vision tasks. Non-learning-based methods utilize different handcrafted priors, such as the separable sparse gradients caused by different levels of blur, which often fail due to their limited capability to describe the properties of real-world reflections. In this paper, we propose a network with a feature-sharing strategy to tackle this problem in a cooperative and unified framework, integrating image context information and multi-scale gradient information. To remove the strong reflections that exist in some local regions, we propose a statistic loss that considers gradient-level statistics between the background and the reflections. Our network is trained on a new dataset with 3250 reflection images taken under diverse real-world scenes. Experiments on a public benchmark dataset show that the proposed method performs favorably against state-of-the-art methods.
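One hedged reading of the statistic loss: match simple gradient-magnitude statistics of the predicted layers to those of the targets. The statistics below are stand-ins for whatever the paper actually measures:

```python
# Illustrative "statistic loss": penalize mismatch between mean gradient
# magnitudes of the predicted background/reflection layers and target statistics.
import torch

def grad_mag(img):                      # img: (B, 1, H, W)
    gx = img[:, :, :, 1:] - img[:, :, :, :-1]
    gy = img[:, :, 1:, :] - img[:, :, :-1, :]
    return gx.abs().mean() + gy.abs().mean()

def statistic_loss(pred_bg, pred_refl, target_bg, target_refl):
    return ((grad_mag(pred_bg) - grad_mag(target_bg)) ** 2 +
            (grad_mag(pred_refl) - grad_mag(target_refl)) ** 2)

b, r = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(statistic_loss(b, r, b.clone(), r.clone()).item())   # 0.0 on matching stats
```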

13.
IEEE Trans Pattern Anal Mach Intell ; 42(3): 580-595, 2020 Mar.
Article in English | MEDLINE | ID: mdl-30475712

ABSTRACT

In this paper, we propose a deep variational and structural hashing (DVStH) method to learn compact binary codes for multimedia retrieval. Unlike most existing deep hashing methods, which use a series of convolutional and fully-connected layers to learn binary features, we develop a probabilistic framework to infer a latent feature representation inside the network. We then design a struct layer, rather than a bottleneck hash layer, to obtain binary codes through a simple encoding procedure. By doing so, we are able to obtain binary codes both discriminatively and generatively. To make the method applicable to scalable cross-modal multimedia retrieval, we extend it to a cross-modal deep variational and structural hashing (CM-DVStH) method. We design a deep fusion network with a struct layer to maximize the correlation between image-text input pairs during the training stage so that a unified binary vector can be obtained. We then design modality-specific hashing networks to handle the out-of-sample extension scenario: we train a network for each modality that outputs a latent representation as close as possible to the binary codes inferred from the fusion network. Experimental results on five benchmark datasets show the efficacy of the proposed approach.

14.
IEEE Trans Image Process ; 28(8): 3794-3807, 2019 Aug.
Article in English | MEDLINE | ID: mdl-30835224

ABSTRACT

The high similarity of different real-world vehicles and the great diversity of acquisition views pose grand challenges to vehicle re-identification (ReID), which traditionally maps vehicle images into a high-dimensional embedding space for distance optimization, vehicle discrimination, and identification. To improve the discriminative capability and robustness of the ReID algorithm, we propose a novel end-to-end embedding adversarial learning network (EALN) that is capable of generating samples localized in the embedding space. Instead of selecting abundant hard negatives from the training set, which is extremely difficult if not impossible, our embedding adversarial learning scheme automatically generates hard negative samples in the specified embedding space, which greatly improves the capability of the network to discriminate similar vehicles. Moreover, the more challenging cross-view vehicle ReID problem, which requires the ReID algorithm to be robust across different query views, can also benefit from this scheme through artificially generated cross-view samples. We demonstrate the promise of EALN through extensive experiments and show the effectiveness of hard-negative and cross-view generation in facilitating vehicle ReID via comparisons with state-of-the-art schemes.

15.
Article in English | MEDLINE | ID: mdl-29994150

ABSTRACT

Hashing, a widely studied solution to approximate nearest neighbor (ANN) search, aims to map data points in a high-dimensional Euclidean space to a low-dimensional Hamming space while preserving the similarity between the original points. As directly learning binary codes can be NP-hard due to discrete constraints, a two-stage scheme, namely "projection and quantization", has become the standard paradigm for learning similarity-preserving hash codes. However, most existing hashing methods treat these two stages separately and thus fail to investigate their complementary effects. In this paper, we systematically study the relationship between projection and quantization, and propose a novel minimal reconstruction bias hashing (MRH) method to learn compact binary codes, in which projection learning and quantization optimization are performed jointly. By introducing a lower-bound analysis, we design an effective ternary search algorithm to solve the corresponding optimization problem. Furthermore, we provide insightful discussion of the proposed MRH approach, including a theoretical proof and the computational complexity. Distinct from previous works, MRH can adaptively adjust the projection dimensionality to balance the information loss between projection and quantization. The proposed framework not only provides a unique perspective from which to view traditional hashing methods but also motivates further research, e.g., guiding the design of loss functions in deep networks. Extensive experimental results show that the proposed MRH significantly outperforms a variety of state-of-the-art methods over eight widely used benchmarks.
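The ternary search itself is generic; a minimal implementation over a unimodal objective follows, where the stand-in objective merely mimics a projection/quantization tradeoff and is not the paper's actual bias function:

```python
# Generic ternary search to minimize a unimodal function on [lo, hi] —
# the form of search one could use to pick the projection dimensionality.
def ternary_search(f, lo, hi, iters=60):
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2          # minimum lies in [lo, m2]
        else:
            lo = m1          # minimum lies in [m1, hi]
    return (lo + hi) / 2

# toy tradeoff: projection loss falls with dimension, quantization loss rises
f = lambda d: (64 - d) ** 2 / 64 + 0.5 * d
print(round(ternary_search(f, 0, 64), 2))   # analytic minimum at d = 48
```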

16.
Article in English | MEDLINE | ID: mdl-29994443

ABSTRACT

Removing undesired reflections from images taken through glass benefits various image processing and computer vision tasks. Existing single-image-based solutions rely heavily on scene priors, such as the separable sparse gradients caused by different levels of blur, and are fragile when such priors are not observed. In this paper, we note that strong reflections usually dominate only a limited region of the whole image, and propose a Region-aware Reflection Removal (R3) approach that automatically detects and heterogeneously processes regions with and without reflections. We integrate content and gradient priors to jointly achieve missing-content restoration as well as background/reflection separation in a unified optimization framework. Extensive validation using 50 sets of real data shows that the proposed method outperforms the state of the art in both quantitative metrics and visual quality.

17.
IEEE Trans Image Process ; 27(5): 2201-2216, 2018 May.
Article in English | MEDLINE | ID: mdl-29432101

ABSTRACT

The Compact Descriptors for Visual Search (CDVS) standard from the ISO/IEC Moving Picture Experts Group has succeeded in enabling interoperability for efficient and effective image retrieval by standardizing the bitstream syntax of compact feature descriptors. However, the intensive computation of a CDVS encoder unfortunately hinders its wide deployment in industry for large-scale visual search. In this paper, we revisit the merits of the low-complexity design of the CDVS core techniques and present a very fast CDVS encoder that leverages the massive parallel execution resources of the graphics processing unit (GPU). We shift the computation-intensive and parallel-friendly modules to state-of-the-art GPU platforms, where the thread-block allocation and memory access mechanisms are jointly optimized to eliminate performance loss. In addition, operations with heavy data dependence are allocated to the CPU to avoid unnecessary computational burden on the GPU. Furthermore, we demonstrate that the proposed fast CDVS encoder works well with convolutional neural network approaches, harmoniously leveraging the advantages of GPU platforms and yielding significant performance improvements. Comprehensive experimental results over benchmarks show that the fast CDVS encoder using GPU-CPU hybrid computing is promising for scalable visual search.

18.
IEEE Trans Image Process ; 27(4): 1586-1599, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29324413

ABSTRACT

Human action recognition in 3D skeleton sequences has attracted substantial research attention. Recently, long short-term memory (LSTM) networks have shown promising performance in this task due to their strength in modeling the dependencies and dynamics of sequential data. As not all skeletal joints are informative for action recognition, and irrelevant joints often introduce noise that can degrade performance, we need to pay more attention to the informative ones. However, the original LSTM network does not have explicit attention ability. In this paper, we propose a new class of LSTM network, the global context-aware attention LSTM, for skeleton-based action recognition, which is capable of selectively focusing on the informative joints in each frame by using a global context memory cell. To further improve the attention capability, we also introduce a recurrent attention mechanism, with which the attention performance of our network is enhanced progressively. A two-stream framework that leverages coarse-grained and fine-grained attention is also introduced. The proposed method achieves state-of-the-art performance on five challenging datasets for skeleton-based action recognition.
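A compact sketch of global context-aware attention over skeleton joints, with a global context vector scoring each joint's informativeness (dimensions and wiring are our assumptions, not the paper's exact cell):

```python
# Illustrative global context-aware attention: a global memory vector scores
# each joint, and the attended summary can be fed back (recurrent attention).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalContextAttention(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, joints, global_ctx):   # joints: (B, J, D); global_ctx: (B, D)
        ctx = global_ctx.unsqueeze(1).expand_as(joints)
        attn = F.softmax(self.score(torch.cat([joints, ctx], -1)).squeeze(-1), dim=1)
        refined = (attn.unsqueeze(-1) * joints).sum(1)    # attended joint summary
        return refined, attn

j, g = torch.randn(2, 25, 128), torch.randn(2, 128)
refined, attn = GlobalContextAttention()(j, g)
print(refined.shape, attn.shape)
```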


Subject(s)
Human Activities/classification; Neural Networks, Computer; Pattern Recognition, Automated/methods; Algorithms; Databases, Factual; Humans; Machine Learning; Memory, Short-Term; Models, Neurological
19.
IEEE Trans Pattern Anal Mach Intell ; 37(12): 2428-40, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26539848

ABSTRACT

There are two sides to every story of visual saliency modeling in the frequency domain. On the one hand, image saliency can be effectively estimated by applying simple operations to the frequency spectrum. On the other hand, it is still unclear which part of the frequency spectrum contributes most to popping out targets and suppressing distractors. Toward this end, this paper explores the secret of image saliency in the frequency domain. From the results obtained in several qualitative and quantitative experiments, we find that the secret of visual saliency may hide mainly in the phases of intermediate frequencies. To explain this finding, we reinterpret the concept of the discrete Fourier transform from the perspective of template-based contrast computation and develop several principles for designing saliency detectors in the frequency domain. Following these principles, we propose a novel approach to designing a saliency detector with the assistance of prior knowledge obtained through both unsupervised and supervised learning. Experimental results on a public image benchmark show that the learned saliency detector outperforms 18 state-of-the-art approaches in predicting human fixations.
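In the spirit of this finding, a few lines suffice to show that keeping only the phase of intermediate frequencies already yields a usable saliency map; the band limits and test image below are arbitrary choices of ours, not the paper's detector:

```python
# Toy phase-only saliency restricted to an intermediate frequency band.
import numpy as np

def phase_saliency(img, f_lo=0.01, f_hi=0.25):
    F_img = np.fft.fft2(img)
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    band = (np.hypot(fy, fx) >= f_lo) & (np.hypot(fy, fx) <= f_hi)
    phase_only = np.exp(1j * np.angle(F_img)) * band   # unit magnitude, band-passed
    sal = np.abs(np.fft.ifft2(phase_only)) ** 2
    return sal / sal.max()

img = np.zeros((64, 64)); img[24:40, 24:40] = 1.0      # a single pop-out square
print(phase_saliency(img).argmax())                    # responds around the square's edges
```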

20.
IEEE Trans Image Process ; 24(9): 2811-26, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25966477

ABSTRACT

The popularity of stereo images and various display devices poses the need for stereo image retargeting techniques. Existing warping-based retargeting methods can preserve the shape of salient objects in a retargeted stereo image pair well. Nevertheless, these methods often incur depth distortion, since they attempt to preserve depth by maintaining the disparity of a set of sparse correspondences rather than directly controlling the warping. In this paper, by considering how to directly control the warping functions, we propose a warping-based stereo image retargeting approach that can simultaneously preserve the shape of salient objects and the depth of 3D scenes. We first characterize depth distortion in terms of warping functions to investigate the impact of a warping function on depth distortion. Based on this depth distortion model, we then exploit the binocular visual characteristics of stereo images to derive region-based depth-preserving constraints that directly control the warping functions so as to faithfully preserve the depth of 3D scenes. Third, with the region-based depth-preserving constraints, we present a novel warping-based stereo image retargeting framework. Since the depth-preserving constraints are derived regardless of shape preservation, we relax them to achieve a tradeoff between shape preservation and depth preservation. Finally, we propose a quad-based implementation of the proposed framework. The results demonstrate the efficacy of our method in both depth and shape preservation for stereo image retargeting.
