Attentive Learning Facilitates Generalization of Neural Networks.
IEEE Trans Neural Netw Learn Syst; PP; 2024 Feb 07.
Article
| MEDLINE
| ID: mdl-38324433
ABSTRACT
This article studies the generalization of neural networks (NNs) by examining how a network changes when it is trained on a sample with or without out-of-distribution (OoD) examples. If the network's predictions are only weakly influenced by fitting the OoD examples, the network is said to learn attentively from the clean training set. A new notion, dataset-distraction stability, is proposed to measure this influence. Extensive CIFAR-10/100 experiments across VGG, ResNet, WideResNet, and ViT architectures and across optimizers show a negative correlation between dataset-distraction stability and generalizability. Using distraction stability, we decompose the learning process on the training set S into multiple learning processes on subsets of S drawn from simpler distributions, i.e., distributions of smaller intrinsic dimension (ID), and thereby derive a tighter generalization bound. Through attentive learning, the seemingly miraculous generalization of deep networks can be explained, and novel algorithms can be designed.
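For illustration, below is a minimal sketch (not from the article) of how a dataset-distraction-style measurement could be set up: train two identically initialized models, one on clean data and one on clean plus OoD data, then compare their predictions on held-out inputs. The synthetic CIFAR-shaped tensors, the tiny classifier, and the KL-divergence proxy are all assumptions standing in for the paper's actual protocol.

```python
# Hypothetical sketch of measuring "dataset-distraction stability".
# Assumptions (not from the article): synthetic CIFAR-shaped tensors stand in
# for real data, and the mean KL divergence between the two models' predictive
# distributions serves as a proxy for how much OoD examples distract training.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

def make_model() -> nn.Module:
    # Deliberately small classifier; the paper uses VGG/ResNet/WideResNet/ViT.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128),
                         nn.ReLU(), nn.Linear(128, 10))

def train(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
          epochs: int = 20) -> nn.Module:
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return model

# Synthetic stand-ins for a clean training set S, OoD examples, and test data.
x_clean = torch.randn(256, 3, 32, 32)
y_clean = torch.randint(0, 10, (256,))
x_ood = 3.0 * torch.randn(64, 3, 32, 32)   # shifted inputs acting as OoD
y_ood = torch.randint(0, 10, (64,))        # arbitrary labels for the OoD set
x_test = torch.randn(128, 3, 32, 32)

# Identical initialization so any divergence comes from the training data.
base = make_model()
model_clean = train(copy.deepcopy(base), x_clean, y_clean)
model_mixed = train(copy.deepcopy(base),
                    torch.cat([x_clean, x_ood]), torch.cat([y_clean, y_ood]))

with torch.no_grad():
    log_p_clean = F.log_softmax(model_clean(x_test), dim=1)
    p_mixed = F.softmax(model_mixed(x_test), dim=1)
    # KL(mixed || clean), averaged over test points: a lower value suggests
    # the OoD examples influenced the learned predictor less.
    distraction = F.kl_div(log_p_clean, p_mixed, reduction="batchmean")

print(f"distraction-stability proxy (mean KL): {distraction.item():.4f}")
```

In the paper's setting, the synthetic tensors would be replaced with CIFAR-10/100 loaders and the measurement repeated across architectures and optimizers; the KL proxy here is just one way to quantify the prediction change the abstract describes.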
Full text:
1
Collection:
01-internacional
Database:
MEDLINE
Language:
En
Journal:
IEEE Trans Neural Netw Learn Syst
Year:
2024
Document type:
Article
Country of publication:
United States