Search | BVS IEC

Bounoua, Mustapha; Franzese, Giulio; Michiardi, Pietro.

Entropy (Basel); 26(4), 2024 Apr 05.

Article in English | MEDLINE | ID: mdl-38667874

ABSTRACT

Multimodal datasets are ubiquitous in modern applications, and multimodal Variational Autoencoders are a popular family of models that aim to learn a joint representation of different modalities. However, existing approaches suffer from a coherence-quality tradeoff in which models with good generation quality lack generative coherence across modalities and vice versa. In this paper, we discuss the limitations underlying the unsatisfactory performance of existing methods in order to motivate the need for a different approach. We propose a novel method that uses a set of independently trained and unimodal deterministic autoencoders. Individual latent variables are concatenated into a common latent space, which is then fed to a masked diffusion model to enable generative modeling. We introduce a new multi-time training method to learn the conditional score network for multimodal diffusion. Our methodology substantially outperforms competitors in both generation quality and coherence, as shown through an extensive experimental campaign.
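The abstract describes concatenating per-modality latent variables into a common latent space and applying a masked diffusion model to it. The following is only a minimal sketch of that idea, not the authors' implementation: the function names, the modality names, the variance-preserving noising rule, and the convention that a mask value of 1 marks a coordinate to be generated are all assumptions made for illustration.

```python
import math
import random

def concat_latents(latents):
    """Concatenate per-modality latent vectors and record each modality's slice."""
    joint, slices, start = [], {}, 0
    for name, z in latents.items():
        joint.extend(z)
        slices[name] = (start, start + len(z))
        start += len(z)
    return joint, slices

def masked_noising(joint, slices, generate, t, rng):
    """Noise only the modalities listed in `generate`; keep the others clean.

    Assumption: a variance-preserving perturbation
    z_t = exp(-t/2) * z_0 + sqrt(1 - exp(-t)) * eps.
    """
    mask = [0.0] * len(joint)
    for name in generate:
        lo, hi = slices[name]
        for i in range(lo, hi):
            mask[i] = 1.0
    scale = math.exp(-t / 2.0)
    sigma = math.sqrt(1.0 - math.exp(-t))
    noisy = [
        m * (scale * z + sigma * rng.gauss(0.0, 1.0)) + (1.0 - m) * z
        for z, m in zip(joint, mask)
    ]
    return noisy, mask

# Hypothetical latents, standing in for the outputs of two unimodal encoders.
latents = {"image": [0.5, -1.0, 2.0], "text": [1.5, 0.0]}
joint, slices = concat_latents(latents)
noisy, mask = masked_noising(joint, slices, ["image"], t=1.0, rng=random.Random(0))
```

In this sketch the "text" coordinates pass through unchanged, so a score network that receives `noisy` and `mask` could treat them as conditioning information while denoising the "image" coordinates.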

How Much Is Enough? A Study on Diffusion Times in Score-Based Generative Models.

Franzese, Giulio; Rossi, Simone; Yang, Lixuan; Finamore, Alessandro; Rossi, Dario; Filippone, Maurizio; Michiardi, Pietro.

Entropy (Basel); 25(4), 2023 Apr 07.

Article in English | MEDLINE | ID: mdl-37190421

ABSTRACT

Score-based diffusion models are a class of generative models whose dynamics is described by stochastic differential equations that map noise into data. While recent works have started to lay down a theoretical foundation for these models, a detailed understanding of the role of the diffusion time T is still lacking. Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution; however, a smaller value of T should be preferred for a better approximation of the score-matching objective and higher computational efficiency. Starting from a variational interpretation of diffusion models, in this work we quantify this trade-off and suggest a new method to improve quality and efficiency of both training and sampling, by adopting smaller diffusion times. Indeed, we show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process. Empirical results support our analysis; for image data, our method is competitive with regard to the state of the art, according to standard sample quality metrics and log-likelihood.
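The role of the diffusion time T can be illustrated with a toy calculation. The sketch below assumes one-dimensional Gaussian data and a variance-preserving forward SDE with constant noise rate 1 (an Ornstein-Uhlenbeck process); both are simplifying assumptions and are not taken from the paper. Under them, the forward marginal at time T is Gaussian, and its KL divergence from the standard normal noise distribution has a closed form.

```python
import math

def forward_marginal(mu0, var0, T):
    """Mean and variance at time T of dx = -x/2 dt + dW, started from N(mu0, var0)."""
    decay = math.exp(-T)
    mean_T = mu0 * math.sqrt(decay)
    var_T = var0 * decay + (1.0 - decay)
    return mean_T, var_T

def kl_to_standard_normal(mean, var):
    """KL( N(mean, var) || N(0, 1) ) for scalar Gaussians."""
    return 0.5 * (var + mean ** 2 - 1.0 - math.log(var))

# Hypothetical data distribution N(3, 0.25); the gap to the noise prior shrinks with T.
gaps = [kl_to_standard_normal(*forward_marginal(3.0, 0.25, T)) for T in (0.5, 1, 2, 4, 8)]
```

The values in `gaps` decrease as T grows, which is the mismatch between the end of the forward dynamics and the noise distribution that a large T is meant to remove and that, according to the abstract, an auxiliary model can bridge when T is small.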

A Scalable Bayesian Sampling Method Based on Stochastic Gradient Descent Isotropization.

Franzese, Giulio; Milios, Dimitrios; Filippone, Maurizio; Michiardi, Pietro.

Entropy (Basel); 23(11), 2021 Oct 28.

Article in English | MEDLINE | ID: mdl-34828123

ABSTRACT

Stochastic gradient (SG)-based algorithms for Markov chain Monte Carlo sampling (SGMCMC) tackle large-scale Bayesian modeling problems by operating on mini-batches and injecting noise on SG steps. The sampling properties of these algorithms are determined by user choices, such as the covariance of the injected noise and the learning rate, and by problem-specific factors, such as assumptions on the loss landscape and the covariance of SG noise. However, current SGMCMC algorithms applied to popular complex models such as Deep Nets cannot simultaneously satisfy the assumptions on loss landscapes and on the behavior of the covariance of the SG noise, while operating with the practical requirement of non-vanishing learning rates. In this work we propose a novel practical method, which makes the SG noise isotropic, using a fixed learning rate that we determine analytically. Extensive experimental validations indicate that our proposal is competitive with the state of the art on SGMCMC.
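For readers unfamiliar with this family of samplers, the sketch below shows stochastic gradient Langevin dynamics (SGLD), a standard SGMCMC baseline, on a toy problem. It is not the isotropization method proposed in the paper. The model (unit-variance Gaussian likelihood with a N(0, 10^2) prior on the mean), the function name, and all hyperparameter values are assumptions chosen for illustration.

```python
import math
import random

def sgld_sample(data, n_steps=5000, batch_size=10, lr=1e-3, seed=0):
    """SGLD samples of the mean of a unit-variance Gaussian under a N(0, 10^2) prior."""
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        # Mini-batch estimate of the gradient of the negative log-posterior.
        grad_prior = theta / 100.0
        grad_lik = (n / batch_size) * sum(theta - x for x in batch)
        grad = grad_prior + grad_lik
        # SG step plus injected Gaussian noise with variance 2 * lr.
        theta = theta - lr * grad + math.sqrt(2.0 * lr) * rng.gauss(0.0, 1.0)
        samples.append(theta)
    return samples

# Hypothetical data set: 100 draws from N(2, 1).
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(100)]
samples = sgld_sample(data)
posterior_mean_estimate = sum(samples[1000:]) / len(samples[1000:])
```

With a fixed, non-vanishing learning rate the mini-batch gradient noise adds to the injected noise, so the spread of `samples` is somewhat wider than the exact posterior. That dependence on the SG noise covariance is the issue the abstract raises.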
