ABSTRACT
Various methods have been proposed to defend against adversarial attacks, but most lack sufficient theoretical guarantees on performance, which gives rise to two problems. First, a shortage of adversarial training samples may attenuate gradient back-propagation, potentially leading to overfitting and gradient masking. Second, point-wise adversarial sampling offers an insufficient support region for the adversarial data and thus cannot form a robust decision boundary. To address these issues, we provide a theoretical analysis revealing the relationship between robust accuracy and the complexity of the training set in adversarial training, and we propose a novel training scheme called Variational Adversarial Defense. Built upon the distribution of adversarial samples, this construction upgrades the defense scheme from local point-wise to distribution-wise, yielding an enlarged support region that safeguards robust training and thus holds greater promise for defending against attacks. The proposed method offers two advantages: 1) instead of seeking adversarial examples point-by-point (in a sequential way), we draw diverse adversarial examples from the inferred distribution; and 2) augmenting the training set with a larger support region improves the smoothness of the decision boundary. Finally, the proposed method is analyzed via a Taylor expansion, which lends our solution natural interpretability.
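To make the point-wise versus distribution-wise contrast concrete, the sketch below fits a per-example Gaussian over perturbations and then draws several diverse adversarial samples from it, instead of returning a single point-wise perturbation as PGD would. The abstract does not specify the paper's actual variational formulation, so the function name `distributional_attack`, its hyperparameters, and the Gaussian-with-tanh parameterization are illustrative assumptions, not the authors' method.

```python
# Minimal sketch of distribution-wise adversarial sampling (assumed details).
# A Gaussian q(delta) = N(mu, sigma^2) over perturbations is optimized to
# raise the model's loss; several samples are then drawn from it, giving an
# enlarged support region around x rather than a single adversarial point.
import torch
import torch.nn.functional as F

def distributional_attack(model, x, y, eps=8/255, steps=10, lr=0.05, n_draws=4):
    # Variational parameters of the perturbation distribution (hypothetical init).
    mu = torch.zeros_like(x, requires_grad=True)
    log_sigma = torch.full_like(x, -3.0, requires_grad=True)
    opt = torch.optim.Adam([mu, log_sigma], lr=lr)

    for _ in range(steps):
        # Reparameterization trick: delta = tanh(mu + sigma * noise) * eps,
        # which keeps the perturbation inside the L-infinity ball of radius eps.
        noise = torch.randn_like(x)
        delta = torch.tanh(mu + log_sigma.exp() * noise) * eps
        loss = F.cross_entropy(model(x + delta), y)
        opt.zero_grad()
        (-loss).backward()  # ascend the loss to find adversarial directions
        opt.step()          # updates only mu and log_sigma

    # Draw diverse adversarial samples from the fitted distribution
    # (inputs assumed to live in [0, 1], hence the clamp).
    with torch.no_grad():
        draws = [
            (x + torch.tanh(mu + log_sigma.exp() * torch.randn_like(x)) * eps).clamp(0, 1)
            for _ in range(n_draws)
        ]
    return torch.cat(draws)
```

In an adversarial-training loop, the `n_draws` returned samples would all be added to the training batch, which is the sense in which the distribution-wise scheme enlarges the support region around each clean example.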