Translational symmetry in convolutions with localized kernels causes an implicit bias toward high frequency adversarial examples.

Caro, Josue O; Ju, Yilong; Pyle, Ryan; Dey, Sourav; Brendel, Wieland; Anselmi, Fabio; Patel, Ankit B

Affiliation

Caro JO; Department of Neuroscience, Baylor College of Medicine, Houston, TX, United States.
Ju Y; Department of Neuroscience, Baylor College of Medicine, Houston, TX, United States.
Pyle R; Department of Electrical and Computer Engineering, Rice University, Houston, TX, United States.
Dey S; Department of Neuroscience, Baylor College of Medicine, Houston, TX, United States.
Brendel W; Department of Electrical and Computer Engineering, Rice University, Houston, TX, United States.
Anselmi F; Manifold AI, San Francisco, CA, United States.
Patel AB; Max Planck Institute for Intelligent Systems, University of Tübingen, Tübingen, Germany.

Front Comput Neurosci ; 18: 1387077, 2024.

Article in En | MEDLINE | ID: mdl-38966128

ABSTRACT

Adversarial attacks are still a significant challenge for neural networks. Recent efforts have shown that adversarial perturbations typically contain high-frequency features, but the root cause of this phenomenon remains unknown. Inspired by theoretical work on linear convolutional models, we hypothesize that translational symmetry in convolutional operations, together with localized kernels, implicitly biases the learning of high-frequency features, and that this is one of the main causes of high-frequency adversarial examples. To test this hypothesis, we analyzed the impact of different choices of linear and non-linear architectures on the implicit bias of the learned features and adversarial perturbations, in the spatial and frequency domains. We find that, independently of the training dataset, convolutional operations yield higher-frequency adversarial attacks than other architectural parameterizations, and that this phenomenon is exacerbated by stronger locality of the kernel (smaller kernel size) and by the depth of the model. The explanation for the kernel size dependence involves the Fourier Uncertainty Principle: a spatially limited filter (local kernel in the space domain) cannot also be frequency-limited (local in the frequency domain). Using larger convolution kernel sizes or avoiding convolutions (e.g., by using Vision Transformers or MLP-style architectures) significantly reduces this high-frequency bias. Looking forward, our work strongly suggests that understanding and controlling the implicit bias of architectures will be essential for achieving adversarial robustness.
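
The Fourier Uncertainty Principle invoked above can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes a 1D Gaussian-shaped filter whose width `sigma` stands in for kernel size, and the function name `spreads` and the grid length `n` are hypothetical choices made for illustration. It measures the RMS spread of the filter's energy in space and in frequency; as the filter becomes more local in space, its spectrum becomes wider, and the product of the two spreads stays near the Gaussian lower bound of 1/(4*pi).

```python
import numpy as np

def spreads(sigma, n=256):
    """Return (spatial RMS width, frequency RMS width) of a Gaussian filter.

    Illustrative assumption: a Gaussian of standard deviation `sigma`
    (in samples) on a length-`n` grid stands in for a localized kernel.
    """
    x = np.arange(n) - n // 2                  # spatial positions centered on 0
    h = np.exp(-x**2 / (2.0 * sigma**2))       # filter taps
    H = np.fft.fft(h)                          # |H| is unaffected by the centering shift
    f = np.fft.fftfreq(n)                      # frequencies in cycles per sample
    p_x = h**2 / np.sum(h**2)                  # normalized energy density in space
    p_f = np.abs(H)**2 / np.sum(np.abs(H)**2)  # normalized energy density in frequency
    width_x = np.sqrt(np.sum(p_x * x**2))      # both densities are centered at 0
    width_f = np.sqrt(np.sum(p_f * f**2))
    return width_x, width_f

results = {sigma: spreads(sigma) for sigma in (2.0, 4.0, 8.0)}
for sigma, (width_x, width_f) in results.items():
    print(f"sigma={sigma}: space={width_x:.3f}, freq={width_f:.4f}, "
          f"product={width_x * width_f:.4f}")
```

Halving `sigma` roughly doubles the frequency spread, which is the trade-off the abstract attributes to small convolution kernels. A Gaussian is the best case for this trade-off; a hard-truncated kernel, as used in convolutional layers, spreads its spectrum further.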

Keywords

Uncertainty Principle; adversarial examples; convolutional architectures; implicit regularization; neural networks

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: Front Comput Neurosci Publication year: 2024 Document type: Article Affiliation country: United States