TOWARDS EFFICIENT OPTIMIZATION METHODS: COMBINATORIAL OPTIMIZATION AND DEEP LEARNING-BASED ROBUST IMAGE CLASSIFICATION
Every optimization problem shares the common objective of finding a minima/maxima, but its application spans over a wide variety of fields ranging from solving NP-hard problems to training a neural network. This thesis addresses two crucial aspects of the above-mentioned fields. The first project is concerned with designing a hardware-system for efficiently solving Traveling Salesman Problem (TSP). It involves encoding the solution to the ground state of an Ising Hamiltonian and finding the minima of the energy landscape. To that end, we i) designed a stochastic nanomagnet-based device as a building block for the system, ii) developed a unique approach to encode any TSP into an array of these blocks, and finally, iii) established the operating principle to make the system converge to an optimal solution. We used this method to solve TSPs having more than 600 nodes.
The next parts of the thesis deal with another genre of optimization problems involving deep neural networks (DNN) in image-classification tasks. DNNs are trained by finding the minima of a loss landscape aimed at mapping input images to a set of discrete labels. Adversarial attacks tend to disrupt this mapping by corrupting the inputs with subtle perturbations, imperceptible to human eyes. Although it is imperative to deploy some external defense mechanisms to guard against these attacks, the defense procedure can be aided by some intrinsic robust properties of the network. In the quest for an inherently resilient neural network, we explored the robustness of biologically-inspired Spiking Neural Networks (SNN) in the second part of the thesis. We demonstrated that accuracy degradation is less severe in SNNs than in their non-spiking counterparts. We attribute this robustness to two fundamental characteristics of SNNs: (i) input discretization and (ii) leak rate in Leaky-Integrate-Fire neurons and analyze their effects.
As mentioned beforehand, this intrinsic robustness is merely an aiding tool to external defense mechanisms. Adversarial training has been established as the stat-of-the-art defense to provide significant robustness against existing attack techniques. This method redefines the boundary of the neural network by augmenting the training dataset with adversarial samples. In the process of achieving robustness, we are faced with a trade-off: a decrease in the prediction accuracy of clean or unperturbed data. The goal of the last section of my thesis is to understand this setback by using Gradient Projection-based sequential learning as an analysis tool. We systematically analyze the interplay between clean training and adversarial training on parameter subspace. In this technique, adversarial training follows clean training task where the parameter update is performed in the orthogonal direction of the previous task (clean training). It is possible to track down the principal component directions responsible for adversarial training by restricting clean and adversarial parameter update to two orthogonal subspaces. By varying the partition of subspace, we showed that the low-variance principal components are not capable of learning adversarial data, rather it is necessary to perform parameter update in a common subspace consisting of higher variance principal components to obtain significant adversarial accuracy. However, disturbing these higher variance components causes the decrease in standard clean accuracy, hence the accuracy-robustness trade-off. Further, we showed that this trade-off is worsened
when the network capacity is smaller due to under-parameterization effect.
- Doctor of Philosophy
- Electrical and Computer Engineering
- West Lafayette