Teaching Masked Autoencoder With Strong Augmentations.
Article in English | MEDLINE | ID: mdl-38980783
ABSTRACT
Masked autoencoder (MAE) has been regarded as a capable self-supervised learner for various downstream tasks. Nevertheless, the model still lacks high-level discriminability, which results in poor linear probing performance. Given that strong augmentation plays an essential role in contrastive learning, can we capitalize on strong augmentation in MAE? The difficulty originates from the pixel uncertainty caused by strong augmentation, which may affect the reconstruction; thus, directly introducing strong augmentation into MAE often hurts performance. In this article, we delve into the potential of strongly augmented views to enhance MAE while preserving MAE's advantages. To this end, we propose a simple yet effective masked Siamese autoencoder (MSA) model, which consists of a student branch and a teacher branch. The student branch inherits MAE's architecture, while the teacher branch treats the unmasked strong view as an exemplary teacher to impose high-level discrimination on the student branch. We demonstrate that our MSA improves the model's spatial perception capability and therefore globally favors interimage discrimination. Empirical evidence shows that the model pretrained by MSA provides superior performance across different downstream tasks. Notably, linear probing on frozen features extracted from MSA yields a 6.1% gain over MAE on ImageNet-1k. Fine-tuning (FT) the network on the VQAv2 task achieves 67.4% accuracy, outperforming the supervised method DeiT by 1.6% and MAE by 1.2%. Codes and models are available at https://github.com/KimSoybean/MSA.
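The abstract describes a two-branch objective: a standard MAE reconstruction loss on a masked view, plus a distillation term that aligns student features with teacher features computed from the unmasked strong view. The sketch below is a hypothetical, simplified rendering of that idea in NumPy, not the authors' implementation: the encoders are stand-ins passed as callables, the decoder is a toy, and the loss names (`rec_loss`, `dist_loss`) and the negative-cosine distillation choice are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(num_patches, ratio, rng):
    """MAE-style random patch masking (e.g., 75% of patches hidden)."""
    n_masked = int(num_patches * ratio)
    idx = rng.permutation(num_patches)
    return idx[:n_masked], idx[n_masked:]  # masked indices, visible indices

def msa_losses(weak_view, strong_view, encode, decode, teacher_encode,
               mask_ratio=0.75, rng=rng):
    """Hypothetical MSA-style objective: MAE reconstruction on the masked
    weak view plus distillation toward the teacher's features of the
    unmasked strong view (teacher gradients would be stopped in practice)."""
    num_patches = weak_view.shape[0]
    masked, visible = random_mask(num_patches, mask_ratio, rng)
    # Student branch: encode only visible patches, reconstruct the rest.
    latent = encode(weak_view[visible])
    pred = decode(latent, num_patches)
    rec_loss = np.mean((pred[masked] - weak_view[masked]) ** 2)
    # Teacher branch: full (unmasked) strong view as an exemplary target.
    t_feat = teacher_encode(strong_view)
    s_feat = encode(weak_view)  # student features on the full weak view
    # Distillation: negative cosine similarity between pooled features.
    s, t = s_feat.mean(axis=0), t_feat.mean(axis=0)
    dist_loss = 1.0 - np.dot(s, t) / (
        np.linalg.norm(s) * np.linalg.norm(t) + 1e-8)
    return rec_loss, dist_loss

# Toy usage with linear "encoders" standing in for transformer branches.
d, k, n = 8, 4, 16  # patch dim, feature dim, number of patches
W_s = rng.normal(size=(d, k))   # student encoder weights (stand-in)
W_d = rng.normal(size=(k, d))   # decoder weights (stand-in)
W_t = rng.normal(size=(d, k))   # teacher encoder weights (stand-in)
weak = rng.normal(size=(n, d))
strong = rng.normal(size=(n, d))
rec, dist = msa_losses(
    weak, strong,
    encode=lambda x: x @ W_s,
    decode=lambda z, n_p: np.tile(z.mean(axis=0) @ W_d, (n_p, 1)),
    teacher_encode=lambda x: x @ W_t,
)
```

In a real setup, the combined loss would be a weighted sum of the two terms, with the teacher updated slowly (e.g., by an exponential moving average of the student), as is common in Siamese self-supervised methods.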

Full text: 1 Collections: 01-international Database: MEDLINE Language: English Journal: IEEE Trans Neural Netw Learn Syst Year of publication: 2024 Document type: Article Country of publication: United States