Purdue University Graduate School

Sparse Deep Learning and Stochastic Neural Network

thesis
posted on 2022-07-22, 17:20, authored by Yan Sun

Deep learning has achieved state-of-the-art performance on many machine learning tasks, but deep neural network (DNN) models still suffer from a few issues. Over-parameterized neural networks generally have a better optimization landscape, but they are computationally expensive, hard to interpret, and usually cannot correctly quantify prediction uncertainty. Small DNN models, on the other hand, can be trapped in local optima and are hard to optimize. In this dissertation, we tackle these issues from two directions: sparse deep learning and stochastic neural networks.


For sparse deep learning, we propose a Bayesian neural network (BNN) model with a mixture-of-normals prior. Theoretically, we establish posterior consistency and structure selection consistency, which ensure that the sparse DNN model can be consistently identified. We also establish asymptotic normality of the predictions, which ensures that prediction uncertainty is correctly quantified. Computationally, we propose a prior annealing approach to optimize the posterior of the BNN. The proposed method has a computational complexity similar to that of standard stochastic gradient descent for training DNNs. Experimental results show that our model performs well on high-dimensional variable selection as well as neural network pruning.
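A two-component mixture-of-normals (spike-and-slab Gaussian) prior of the kind described above can be sketched as follows. The hyperparameter values (mixing weight `lam`, spike scale `sigma0`, slab scale `sigma1`) are illustrative placeholders, not the values used in the dissertation:

```python
import numpy as np

def log_mixture_normal_prior(w, lam=1e-4, sigma0=1e-4, sigma1=1.0):
    """Log-density of the mixture-of-normals prior
        pi(w) = (1 - lam) * N(0, sigma0^2) + lam * N(0, sigma1^2),
    a narrow "spike" around zero plus a wide "slab".
    Hyperparameter values here are purely illustrative."""
    def log_norm(x, s):
        # log density of N(0, s^2) at x
        return -0.5 * np.log(2 * np.pi * s**2) - x**2 / (2 * s**2)

    a = np.log1p(-lam) + log_norm(w, sigma0)  # spike component
    b = np.log(lam) + log_norm(w, sigma1)     # slab component
    m = np.maximum(a, b)
    # log-sum-exp for numerical stability
    return m + np.log(np.exp(a - m) + np.exp(b - m))
```

Under such a prior, weights near zero fall in the spike and are candidates for pruning, while weights with non-negligible magnitude are attributed to the slab; this is what makes sparse structure selection possible.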


For stochastic neural networks, we propose the Kernel-Expanded Stochastic Neural Network (K-StoNet) model. We reformulate the DNN as a latent variable model and incorporate support vector regression (SVR) as the first hidden layer. The latent variable formulation breaks training into a series of convex optimization problems, and the model can be easily trained using the imputation-regularized optimization (IRO) algorithm. We provide theoretical guarantees for the convergence of the algorithm and for prediction uncertainty quantification. Experimental results show that the proposed model achieves good prediction performance and provides correct confidence regions for its predictions.
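The IRO idea can be illustrated on a toy linear-Gaussian stochastic network: treat the hidden layer output Z as a latent variable, alternate between imputing Z from its conditional distribution given the data (I-step) and refitting each layer by a separate convex regression (RO-step). This sketch uses ridge regression for both layers as a stand-in for the SVR first layer, and identity activations so the imputation distribution is Gaussian in closed form; all model sizes and noise scales are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends on x through a latent linear layer
n, p, h = 200, 3, 4
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=(p, 1)) + 0.1 * rng.normal(size=(n, 1))

# Model: Z = X W1 + e1,  Y = Z W2 + e2  (identity activations)
W1 = 0.1 * rng.normal(size=(p, h))
W2 = 0.1 * rng.normal(size=(h, 1))
s1, s2 = 1.0, 1.0  # assumed noise scales of the two layers

def ridge(A, B, lam=1e-3):
    """Solve the convex problem min ||A W - B||^2 + lam ||W||^2."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ B)

for _ in range(50):
    # I-step: impute the latent layer Z from p(Z | X, Y), which is
    # Gaussian under this linear model (closed-form mean and precision).
    prec = np.eye(h) / s1**2 + (W2 @ W2.T) / s2**2
    mean = np.linalg.solve(prec, (X @ W1).T / s1**2 + (W2 @ Y.T) / s2**2).T
    noise = rng.normal(size=(n, h)) @ np.linalg.cholesky(np.linalg.inv(prec)).T
    Z = mean + 0.05 * noise
    # RO-step: with Z imputed, each layer is a separate convex regression.
    W1 = ridge(X, Z)
    W2 = ridge(Z, Y)

pred = X @ W1 @ W2
```

The key property this toy example shares with K-StoNet is that neither step ever optimizes a non-convex objective: the imputation is a draw from a tractable conditional, and each regression step is convex, which is what sidesteps the local-trap problem of direct DNN training.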

History

Degree Type

  • Doctor of Philosophy

Department

  • Statistics

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Faming Liang

Additional Committee Member 2

Xiao Wang

Additional Committee Member 3

Chuanhai Liu

Additional Committee Member 4

Vinayak Rao
