Search | VHL Regional Portal

1.

Learning with Style: Continual Semantic Segmentation Across Tasks and Domains.

Toldo, Marco; Michieli, Umberto; Zanuttigh, Pietro.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 May 07.

Article in English | MEDLINE | ID: mdl-38713563

ABSTRACT

Deep learning models dealing with image understanding in real-world settings must be able to adapt to a wide variety of tasks across different domains. Domain adaptation and class incremental learning deal with domain and task variability separately, whereas their unified solution is still an open problem. We tackle both facets of the problem together, taking into account the semantic shift within both input and label spaces. We start by formally introducing continual learning under task and domain shift. Then, we address the proposed setup by using style transfer techniques to extend knowledge across domains when learning incremental tasks and a robust distillation framework to effectively recollect task knowledge under incremental domain shift. The devised framework (LwS, Learning with Style) is able to generalize incrementally acquired task knowledge across all the domains encountered, proving to be robust against catastrophic forgetting. Extensive experimental evaluation on multiple autonomous driving datasets shows how the proposed method outperforms existing approaches, which prove to be ill-equipped to deal with continual semantic segmentation under both task and domain shift. The code is available at https://lttm.dei.unipd.it/paper data/LwS.
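As a rough illustration of what distillation means in class-incremental learning, the sketch below computes a generic output-level knowledge distillation loss: the cross-entropy between the softened predictions of a previous model and those of the current model, restricted to the previously learned classes. This is a textbook formulation shown under stated assumptions, not the LwS loss from the paper; the function names, the temperature value, and the example logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """Cross-entropy between the old model's soft targets and the new
    model's predictions, evaluated only over the old classes.

    Assumption: the first len(old_logits) entries of new_logits
    correspond to the classes the old model already knew.
    """
    n_old = len(old_logits)
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits[:n_old], temperature)
    return -sum(p * math.log(q) for p, q in zip(p_old, p_new))


# Hypothetical per-pixel logits: two old classes, one newly added class.
old = [2.0, 0.5]
agrees = [2.0, 0.5, 1.0]     # new model preserves the old ranking
disagrees = [0.5, 2.0, 1.0]  # new model has drifted on old classes
print(distillation_loss(old, agrees), distillation_loss(old, disagrees))
```

The loss is smallest when the new model reproduces the old model's distribution over old classes, which is why a term of this kind is commonly added to the training objective to limit forgetting.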

2.

Reframing control methods for parameters optimization in adversarial image generation.

Alfalouji, Qamar; Sartor, Piergiorgio; Zanuttigh, Pietro.

Neural Netw ; 153: 303-313, 2022 Sep.

Article in English | MEDLINE | ID: mdl-35772251

ABSTRACT

Training procedures for deep networks require setting several hyper-parameters that strongly affect the results. The problem is even worse in adversarial learning strategies used for image generation, where properly balancing the discriminative and generative networks is fundamental for effective training. In this work we propose a novel hyper-parameter optimization strategy based on Proportional-Integral (PI) and Proportional-Integral-Derivative (PID) controllers. Both open-loop and closed-loop schemes are proposed for tuning a single parameter or multiple parameters together, allowing efficient parameter tuning without resorting to computationally demanding trial-and-error schemes. We applied the proposed strategies to the widely used BEGAN and CycleGAN models, where they achieved more stable training that converges faster. The obtained images are also sharper, with slightly better quality both visually and according to the FID and FCN metrics. Image translation results also showed better background preservation and fewer color artifacts with respect to CycleGAN.

Subjects

Artifacts; Image Processing, Computer-Assisted; Image Processing, Computer-Assisted/methods
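To make the closed-loop idea in this abstract concrete, here is a minimal sketch of a discrete PI/PID controller driving a single tunable parameter until a monitored statistic reaches a target value. It is not the authors' scheme: the class `PIDController`, the gains, the target, and especially `toy_balance` (a stand-in for whatever training statistic would be monitored) are hypothetical and chosen only so the loop is easy to follow.

```python
class PIDController:
    """Textbook discrete PID controller (set kd=0 for a PI controller)."""

    def __init__(self, kp, ki, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement, dt=1.0):
        error = setpoint - measurement
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def toy_balance(weight):
    """Hypothetical stand-in for a statistic measured during training.

    Assumption: the statistic responds linearly to the parameter.
    Real training dynamics are far less well behaved.
    """
    return 0.5 * weight


def tune(steps=200, target=1.0):
    """Closed loop: measure, compare with the target, adjust the parameter."""
    controller = PIDController(kp=0.5, ki=0.2)
    weight = 0.0
    for _ in range(steps):
        measured = toy_balance(weight)
        weight = controller.update(target, measured)
    return weight, toy_balance(weight)


weight, measured = tune()
print(weight, measured)
```

In this toy setting the integral term is what removes the steady-state error: the loop settles where the measured statistic equals the target, without any manual trial-and-error over the parameter value.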

3.

Unsupervised Domain Adaptation of Deep Networks for ToF Depth Refinement.

Agresti, Gianluca; Schafer, Henrik; Sartor, Piergiorgio; Incesu, Yalcin; Zanuttigh, Pietro.

IEEE Trans Pattern Anal Mach Intell ; 44(12): 9195-9208, 2022 Dec.

Article in English | MEDLINE | ID: mdl-34714740

ABSTRACT

Depth maps acquired with ToF cameras have limited accuracy due to the high noise level and to multi-path interference. Deep networks can be used for refining ToF depth, but their training requires real-world acquisitions with ground truth, which is complex and expensive to collect. A possible workaround is to train networks on synthetic data, but the domain shift between real and synthetic data reduces performance. In this paper, we propose three approaches to perform unsupervised domain adaptation of a depth denoising network from synthetic to real data. These approaches act, respectively, at the input, feature, and output levels of the network. The first approach uses domain translation networks to transform labeled synthetic ToF data into a representation closer to real data, which is then used to train the denoiser. The second approach tries to align the network's internal features related to synthetic and real data. The third approach uses an adversarial loss, implemented with a discriminator trained to recognize the ground truth statistics, to train the denoiser on unlabeled real data. Experimental results show that the considered approaches outperform other state-of-the-art techniques and achieve superior denoising performance.

4.

Deep Learning for Transient Image Reconstruction from ToF Data.

Buratto, Enrico; Simonetto, Adriano; Agresti, Gianluca; Schäfer, Henrik; Zanuttigh, Pietro.

Sensors (Basel) ; 21(6), 2021 Mar 11.

Article in English | MEDLINE | ID: mdl-33799603

ABSTRACT

In this work, we propose a novel approach for correcting multi-path interference (MPI) in Time-of-Flight (ToF) cameras by estimating the direct and global components of the incoming light. MPI is an error source linked to the multiple reflections of light inside a scene: each sensor pixel receives information coming from different light paths, which generally leads to an overestimation of the depth. We introduce a novel deep learning approach that estimates the structure of the time-dependent scene impulse response and from it recovers a depth image with a reduced amount of MPI. The model consists of two main blocks: a predictive model that learns a compact encoded representation of the backscattering vector from the noisy input data, and a fixed backscattering model that translates the encoded representation into the high-dimensional light response. Experimental results on real data show the effectiveness of the proposed approach, which reaches state-of-the-art performance.
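The claim that MPI generally leads to depth overestimation can be illustrated with the standard phasor model of a continuous-wave ToF pixel, where each light path contributes a complex term whose phase grows with path length. The sketch below is a simplified illustration, not the method of the paper: the function names, the 20 MHz modulation frequency, and the path amplitudes and lengths are all assumed values.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def phase_for_path(path_length_m, mod_freq_hz):
    """Phase delay accumulated along a path of the given total length."""
    return 2 * math.pi * mod_freq_hz * path_length_m / C


def measured_depth(paths, mod_freq_hz):
    """Depth a single-frequency ToF pixel would report.

    paths: list of (amplitude, total path length in metres). The direct
    return from a surface at depth d has path length 2 * d.
    Assumption: the pixel simply sums the phasors of all paths and
    converts the resulting phase back to depth.
    """
    phasor = sum(a * cmath.exp(1j * phase_for_path(length, mod_freq_hz))
                 for a, length in paths)
    phase = cmath.phase(phasor) % (2 * math.pi)
    return C * phase / (4 * math.pi * mod_freq_hz)


FREQ = 20e6  # assumed modulation frequency
direct_only = measured_depth([(1.0, 2.0)], FREQ)
with_mpi = measured_depth([(1.0, 2.0), (0.3, 3.0)], FREQ)
print(direct_only, with_mpi)
```

Because the indirect path is longer than the direct one, its phasor pulls the summed phase upward, so the reported depth lands above the true 1 m. Separating the direct component from the global one is what removes this bias.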

5.

A multi-camera dataset for depth estimation in an indoor scenario.

Marin, Giulio; Agresti, Gianluca; Minto, Ludovico; Zanuttigh, Pietro.

Data Brief ; 27: 104619, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31687438

ABSTRACT

Time-of-Flight (ToF) sensors and stereo vision systems are two of the most widespread depth acquisition devices for commercial and industrial applications. They have complementary strengths and weaknesses. For this reason, combining the data acquired from these devices can improve the final depth estimation accuracy. This paper introduces a dataset acquired with a multi-camera system composed of a Microsoft Kinect v2 ToF sensor, an Intel RealSense R200 active stereo sensor and a Stereolabs ZED passive stereo camera system. The acquired scenes include indoor settings with different external lighting conditions. The depth ground truth has been acquired for each scene of the dataset using a line laser. The data can be used for developing fusion and denoising algorithms for depth estimation and for testing them under different lighting conditions. A subset of the data has already been used for the experimental evaluation of the work "Stereo and ToF Data Fusion by Learning from Synthetic Data".

6.

Probabilistic ToF and stereo data fusion based on mixed pixels measurement models.

Mutto, Carlo Dal; Zanuttigh, Pietro; Cortelazzo, Guido Maria.

IEEE Trans Pattern Anal Mach Intell ; 37(11): 2260-72, 2015 Nov.

Article in English | MEDLINE | ID: mdl-26440266

ABSTRACT

This paper proposes a method for fusing data acquired by a ToF camera and a stereo pair, based on a model of ToF depth measurement that also accounts for depth discontinuity artifacts due to the mixed pixel effect. This model is exploited within both an ML and a MAP-MRF framework for ToF and stereo data fusion. The proposed MAP-MRF framework is characterized by site-dependent range values, an important feature since it can be used both to improve the accuracy and to decrease the computational complexity of standard MAP-MRF approaches. To optimize the site-dependent global cost function characteristic of the proposed MAP-MRF approach, this paper also introduces an extension to Loopy Belief Propagation which can be used in other contexts. Experimental data validate the proposed ToF measurement model and the effectiveness of the proposed fusion techniques.

7.

Scalable coding of depth maps with R-D optimized embedding.

Mathew, Reji; Taubman, David; Zanuttigh, Pietro.

IEEE Trans Image Process ; 22(5): 1982-95, 2013 May.

Article in English | MEDLINE | ID: mdl-23335671

ABSTRACT

Recent work on depth map compression has revealed the importance of incorporating a description of discontinuity boundary geometry into the compression scheme. We propose a novel compression strategy for depth maps that incorporates geometry information while achieving the goals of scalability and embedded representation. Our scheme involves two separate image pyramid structures, one for breakpoints and the other for sub-band samples produced by a breakpoint-adaptive transform. Breakpoints capture geometric attributes and are amenable to scalable coding. We develop a rate-distortion optimization framework for determining the presence and precision of breakpoints in the pyramid representation. We employ a variation of the EBCOT scheme to produce embedded bit-streams for both the breakpoint and sub-band data. Compared to JPEG 2000, our proposed scheme enables the same scalability features while achieving substantially improved rate-distortion performance at the higher bit-rate range and comparable performance at the lower rates.
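A rate-distortion decision of the kind mentioned in this abstract is commonly posed as minimizing a Lagrangian cost J = D + λR over a set of coding options. The sketch below shows that generic selection rule only; it is not the framework developed in the paper, and the option labels, bit counts, distortion values, and the name `best_option` are hypothetical.

```python
def best_option(options, lam):
    """Pick the option minimizing the Lagrangian cost D + lam * R.

    options: list of (label, rate_bits, distortion) tuples.
    lam: Lagrange multiplier trading bits against distortion.
    """
    return min(options, key=lambda opt: opt[2] + lam * opt[1])


# Hypothetical choices for one breakpoint: omit it, code it coarsely,
# or code it at fine precision. All numbers are illustrative only.
candidates = [
    ("absent", 0, 100.0),
    ("coarse", 4, 30.0),
    ("fine", 12, 10.0),
]

for lam in (1.0, 5.0, 30.0):
    label, rate, distortion = best_option(candidates, lam)
    print(lam, label, distortion + lam * rate)
```

A small λ corresponds to a high bit-rate operating point, where spending bits on precise geometry pays off; as λ grows, the coarse description and eventually no breakpoint at all become the cheaper choice.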
