Deep learning-based image deconstruction method with maintained saliency.

Fujimoto, Keisuke; Hayashi, Kojiro; Katayama, Risa; Lee, Sehyung; Liang, Zhen; Yoshida, Wako; Ishii, Shin

Affiliation

Fujimoto K; Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.
Hayashi K; Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.
Katayama R; Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.
Lee S; Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.
Liang Z; School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, People's Republic of China; Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan.
Yoshida W; Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan; Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom.
Ishii S; Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan; ATR Neural Information Analysis Laboratories, Kyoto 619-0288, Japan. Electronic address: ishii@i.kyoto-u.ac.jp.

Neural Netw; 155: 224-241, 2022 Nov.

Article in En | MEDLINE | ID: mdl-36081196

ABSTRACT

Visual properties that primarily attract bottom-up attention are collectively referred to as saliency. In this study, to understand the neural activity involved in top-down and bottom-up visual attention, we aim to prepare pairs of natural and unnatural images with common saliency. For this purpose, we propose an image transformation method based on deep neural networks that can generate new images while keeping a feature map, in particular the saliency map, consistent. This is an ill-posed problem because the transformation from an image to its corresponding feature map can be many-to-one; in our case, many different images share the same saliency map. Although stochastic image generation has the potential to solve such ill-posed problems, most existing methods focus on diversifying the overall style or touch while maintaining the naturalness of the generated images. We therefore developed a new image transformation method that incorporates higher-dimensional latent variables so that the generated images appear unnatural, with less context information, but retain a high diversity of local image structures. Because such high-dimensional latent spaces are prone to collapse, we propose a new regularization based on Kullback-Leibler divergence to prevent the latent distribution from collapsing. We also conducted human experiments using our newly prepared natural and corresponding unnatural images, recording overt eye movements and functional magnetic resonance imaging data, and found that those images induced distinctive neural activities related to top-down and bottom-up attentional processing.
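
For readers unfamiliar with the Kullback-Leibler term mentioned above, the following is a minimal sketch of the standard KL divergence used in a variational autoencoder with a diagonal Gaussian posterior and a standard normal prior. It is not the new regularization proposed in the article, whose form the abstract does not give; the function names `gaussian_kl` and `reparameterize` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_kl(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).

    Standard VAE term, summed over the last (latent) axis.
    Assumption: this is the textbook formula, not the article's regularizer.
    """
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = np.array([1.0, 0.0])
    log_var = np.zeros(2)
    print("KL:", gaussian_kl(mu, log_var))
    print("sample:", reparameterize(mu, log_var, rng))
```

In this standard setting, a KL value of zero for every input means the posterior equals the prior and the latent code carries no information about the image, which is one common description of a collapsed latent distribution.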

Subjects

Deep Learning; Humans; Neural Networks, Computer; Magnetic Resonance Imaging

Keywords

Attention; Deep learning; Functional magnetic resonance imaging; Image transformation; Saliency map; Variational autoencoder

Full text: 1 | Database: MEDLINE | Main subject: Deep Learning | Language: En | Publication year: 2022 | Document type: Article