ADVERSARIAL LEARNING ON ROBUSTNESS AND GENERATIVE MODELS
thesisposted on 03.08.2021, 15:26 authored by Qingyi GaoQingyi Gao
In this dissertation, we study two important problems in the area of modern deep learning: adversarial robustness and adversarial generative model. In the first part, we study the generalization performance of deep neural networks (DNNs) in adversarial learning. Recent studies have shown that many machine learning models are vulnerable to adversarial attacks, but much remains unknown concerning its generalization error in this scenario. We focus on the $\ell_\infty$ adversarial attacks produced under the fast gradient sign method (FGSM). We establish a tight bound for the adversarial Rademacher complexity of DNNs based on both spectral norms and ranks of weight matrices. The spectral norm and rank constraints imply that this class of networks can be realized as a subset of the class of a shallow network composed with a low dimensional Lipschitz continuous function. This crucial observation leads to a bound that improves the dependence on the network width compared to previous works and achieves depth independence. We show that adversarial Rademacher complexity is always larger than its natural counterpart, but the effect of adversarial perturbations can be limited under our weight normalization framework.
In the second part, we study deep generative models that receive great success in many fields. It is well-known that the complex data usually does not populate its ambient Euclidean space but resides in a lower-dimensional manifold instead. Thus, misspecifying the latent dimension in generative models will result in a mismatch of latent representations and poor generative qualities. To address these problems, we propose a novel framework called Latent Wasserstein GAN (LWGAN) to fuse the auto-encoder and WGAN such that the intrinsic dimension of data manifold can be adaptively learned by an informative latent distribution. In particular, we show that there exist an encoder network and a generator network in such a way that the intrinsic dimension of the learned encodes distribution is equal to the dimension of the data manifold. Theoretically, we prove the consistency of the estimation for the intrinsic dimension of the data manifold and derive a generalization error bound for LWGAN. Comprehensive empirical experiments verify our framework and show that LWGAN is able to identify the correct intrinsic dimension under several scenarios, and simultaneously generate high-quality synthetic data by samples from the learned latent distribution.