Smoothness regularization is a popular method for reducing generalization error. We propose a novel regularization technique that rewards local distributional smoothness (LDS), a KL-divergence-based measure of the model's robustness against perturbation. The LDS is defined in terms of the direction in input space to which the model distribution is most sensitive. We call training with LDS regularization virtual adversarial training (VAT). Our technique resembles adversarial training, but differs in that it determines the adversarial direction from the model distribution alone and does not use label information. The technique is therefore applicable even to semi-supervised learning. When we applied our technique to classification on the permutation-invariant MNIST dataset, it not only outperformed all models that do not depend on generative models or pre-training, but also performed well in comparison to the state-of-the-art method, which uses a highly advanced generative model.
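To make the LDS idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear-softmax classifier as a stand-in model, takes the LDS at an input x to be the negative KL divergence between the model's prediction at x and at x + r, and picks r as the most sensitive direction scaled to norm `eps`. The direction is approximated with a few power-iteration steps on a finite-difference gradient, which is one simple choice among several. The names `predict`, `virtual_adversarial_perturbation`, `lds`, `eps`, `xi`, and `h` are illustrative assumptions. No label appears anywhere in the computation, which is what would let such a term be evaluated on unlabeled data.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict(W, x):
    """Class probabilities of a toy linear-softmax model (hypothetical stand-in)."""
    return softmax([sum(w * c for w, c in zip(row, x)) for row in W])


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def normalize(v):
    """Scale v to unit Euclidean norm (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v] if n > 0 else v


def add(x, r):
    return [a + b for a, b in zip(x, r)]


def virtual_adversarial_perturbation(W, x, eps, xi=1e-2, n_iter=3, h=1e-5, seed=0):
    """Approximate the norm-eps perturbation that changes the prediction most.

    Assumption: power iteration on a central finite-difference gradient of
    KL(p(.|x) || p(.|x + r)) evaluated at the small probe r = xi * d.
    """
    rng = random.Random(seed)
    p = predict(W, x)  # the model's own prediction; no label is used
    d = normalize([rng.gauss(0.0, 1.0) for _ in x])
    for _ in range(n_iter):
        r = [xi * c for c in d]
        grad = []
        for i in range(len(x)):
            plus, minus = list(r), list(r)
            plus[i] += h
            minus[i] -= h
            k_plus = kl(p, predict(W, add(x, plus)))
            k_minus = kl(p, predict(W, add(x, minus)))
            grad.append((k_plus - k_minus) / (2.0 * h))
        d = normalize(grad)
    return [eps * c for c in d]


def lds(W, x, eps):
    """Negative KL between predictions at x and at the perturbed input."""
    r = virtual_adversarial_perturbation(W, x, eps)
    return -kl(predict(W, x), predict(W, add(x, r)))


if __name__ == "__main__":
    W = [[1.0, -2.0, 0.5], [-1.0, 0.5, 2.0]]  # hypothetical weights, 2 classes
    x = [0.2, -0.1, 0.4]                      # hypothetical input
    print("LDS at x:", lds(W, x, eps=0.5))
```

In a training loop, a term of this kind would be averaged over labeled and unlabeled inputs and added, with a weighting coefficient, to the usual supervised objective. An automatic-differentiation framework would replace the finite-difference loop in practice.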