File(s) under embargo: 4 month(s) 12 day(s) until file(s) become available
A SYSTEMATIC STUDY OF SPARSE DEEP LEARNING WITH DIFFERENT PENALTIES
Deep learning has been the driving force behind many successful data science achievements. However, the deep neural network (DNN) that forms the basis of deep learning is
often over-parameterized, leading to training, prediction, and interpretation challenges. To
address this issue, it is common practice to apply an appropriate penalty to each connection
weight, limiting its magnitude. This approach is equivalent to imposing a prior distribution
on each connection weight from a Bayesian perspective. This project offers a systematic investigation into the selection of the penalty function or prior distribution. Specifically, under
the general theoretical framework of posterior consistency, we prove that consistent sparse
deep learning can be achieved with a variety of penalty functions or prior distributions.
Examples include amenable regularization penalties (such as MCP and SCAD), spike-and-slab priors (such as the mixture Gaussian distribution and the mixture Laplace distribution), and
polynomially decaying priors (such as the Student-t distribution). Our theory is supported by
numerical results.
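
For readers unfamiliar with the penalties and priors named above, the following minimal NumPy sketch (not taken from the dissertation; parameter values such as lam, gamma, a, p, sigma0, sigma1, and nu are illustrative assumptions) writes each of them as an elementwise penalty that could be added to a DNN's training loss over its connection weights.

import numpy as np

def mcp(w, lam=0.1, gamma=3.0):
    # Minimax concave penalty (MCP), applied elementwise to weights w.
    a = np.abs(w)
    return np.where(a <= gamma * lam,
                    lam * a - a**2 / (2.0 * gamma),
                    0.5 * gamma * lam**2)

def scad(w, lam=0.1, a=3.7):
    # Smoothly clipped absolute deviation (SCAD) penalty, elementwise.
    x = np.abs(w)
    inner = lam * x
    middle = (2.0 * a * lam * x - x**2 - lam**2) / (2.0 * (a - 1.0))
    outer = 0.5 * (a + 1.0) * lam**2
    return np.where(x <= lam, inner, np.where(x <= a * lam, middle, outer))

def neg_log_mixture_gaussian(w, p=0.05, sigma0=0.01, sigma1=1.0):
    # Negative log-density of a spike-and-slab mixture Gaussian prior
    # (narrow "spike" component plus wide "slab" component).
    spike = (1.0 - p) * np.exp(-0.5 * (w / sigma0)**2) / (sigma0 * np.sqrt(2 * np.pi))
    slab = p * np.exp(-0.5 * (w / sigma1)**2) / (sigma1 * np.sqrt(2 * np.pi))
    return -np.log(spike + slab)

def neg_log_student_t(w, nu=1.0):
    # Negative log-density (up to an additive constant) of a Student-t prior,
    # an example of a polynomially decaying prior.
    return 0.5 * (nu + 1.0) * np.log1p(w**2 / nu)

# Each function acts as an additive regularizer on the training loss:
# total_loss = data_loss + sum of the penalty over all connection weights.
weights = np.random.randn(1000) * 0.5  # stand-in for a DNN's connection weights
for name, pen in [("MCP", mcp), ("SCAD", scad),
                  ("mixture Gaussian", neg_log_mixture_gaussian),
                  ("Student-t", neg_log_student_t)]:
    print(f"{name:18s} total penalty: {pen(weights).sum():.3f}")

The priors are written as negative log-densities because, as the abstract notes, penalizing a connection weight is equivalent (from a Bayesian perspective) to placing a prior on it: minimizing data loss plus the negative log-prior is the same as maximizing the posterior.
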
Degree Type
- Doctor of Philosophy
Department
- Statistics
Campus location
- West Lafayette