Pruning Convolution Neural Network (SqueezeNet) for Efficient Hardware Deployment

Gaikwad, Akash

doi:10.25394/PGS.7418480.v1

Thesis__Pruning_Convolution_Neural_Network__SqueezeNet__for_Efficient_Hardware_Deployment_.pdf (4.68 MB)

thesis

posted on 2019-01-17, 03:13 authored by Akash Gaikwad

In recent years, deep learning models have become popular in real-time embedded applications, but hardware deployment is complicated by limited resources such as memory, computational power, and energy. Recent research in deep learning focuses on reducing the model size of the Convolution Neural Network (CNN) through compression techniques such as architectural compression, pruning, quantization, and encoding (e.g., Huffman encoding). Network pruning is one of the promising techniques for addressing these problems.

This thesis proposes three methods to prune the convolution neural network SqueezeNet. Each method decreases the model size without introducing network sparsity in the pruned model and without a significant drop in accuracy.

1: Pruning based on the Taylor expansion of the change in the cost function, ΔC.

2: Pruning based on L₂ normalization of activation maps.

3: Pruning based on a combination of method 1 and method 2.
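
The three criteria listed above can be illustrated with a minimal sketch. It assumes that the Taylor score of a filter is the absolute value of the mean of activation times gradient over its feature map (the common first-order form), that the L₂ score is the Euclidean norm of the activation map, and that the combined score is the sum of the two after each is rescaled to unit L₂ norm across the layer. These definitions, the function names, and the toy numbers are assumptions made for illustration and may differ from the exact formulas used in the thesis.

```python
import math

def taylor_score(activation, gradient):
    """First-order Taylor estimate of |delta C| for one filter:
    |mean(a * dC/da)| over its flattened feature map (assumed form)."""
    products = [a * g for a, g in zip(activation, gradient)]
    return abs(sum(products) / len(products))

def l2_score(activation):
    """Euclidean (L2) norm of one filter's flattened activation map."""
    return math.sqrt(sum(a * a for a in activation))

def unit_normalize(scores):
    """Rescale a layer's scores to unit L2 norm so that different
    criteria (and different layers) are on a comparable scale."""
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / norm for s in scores] if norm > 0 else list(scores)

def combined_scores(activations, gradients):
    """Hypothetical combination: sum of the two normalized criteria."""
    taylor = unit_normalize(
        [taylor_score(a, g) for a, g in zip(activations, gradients)]
    )
    l2 = unit_normalize([l2_score(a) for a in activations])
    return [t + n for t, n in zip(taylor, l2)]

# Toy layer with three filters, each with a 4-value feature map.
activations = [[1.0, 2.0, 0.0, 1.0], [0.1, 0.0, 0.1, 0.0], [3.0, 1.0, 2.0, 2.0]]
gradients = [[0.5, -0.5, 0.2, 0.1], [0.01, 0.02, 0.0, 0.0], [0.3, 0.3, -0.1, 0.2]]

scores = combined_scores(activations, gradients)
lowest = min(range(len(scores)), key=scores.__getitem__)
print("scores:", scores, "lowest-ranked filter:", lowest)
```

In this toy layer the second filter has both the weakest activations and the smallest activation-gradient product, so it ranks lowest under all three criteria and would be the first pruning candidate.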

The proposed methods use different ranking criteria to rank the convolution kernels and prune the lower-ranked filters; afterwards, the SqueezeNet model is fine-tuned by backpropagation. Transfer learning is used to train SqueezeNet on the CIFAR-10 dataset. Results show that the proposed approach reduces the SqueezeNet model size by 72% without a significant drop in accuracy (the optimal pruning efficiency result). Results also show that pruning based on a combination of the Taylor expansion of the cost function and L₂ normalization of activation maps achieves better pruning efficiency than either individual pruning criterion, and that most of the pruned kernels come from mid- and high-level layers. The pruned model was deployed on BlueBox 2.0 using RTMaps software, and its performance was evaluated.
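
The rank-then-prune step described above might be sketched as follows. The sketch assumes that filters are ranked globally across layers and that the size reduction counts only each removed filter's own weights; the layer names, per-filter weight counts, and scores are hypothetical, and the fine-tuning step is not shown.

```python
def select_filters_to_prune(scores_by_layer, num_to_prune):
    """Return the (layer, filter_index) pairs with the lowest scores,
    ranked globally across all layers (an assumption of this sketch)."""
    ranked = sorted(
        (score, layer, idx)
        for layer, scores in scores_by_layer.items()
        for idx, score in enumerate(scores)
    )
    return [(layer, idx) for _, layer, idx in ranked[:num_to_prune]]

def fraction_removed(scores_by_layer, weights_per_filter, pruned):
    """Fraction of convolution weights removed. Simplification: only the
    pruned filters' own weights are counted, not the matching input
    channels that disappear from the following layer."""
    total = sum(
        len(scores) * weights_per_filter[layer]
        for layer, scores in scores_by_layer.items()
    )
    removed = sum(weights_per_filter[layer] for layer, _ in pruned)
    return removed / total

# Hypothetical scores and per-filter weight counts for two layers.
scores_by_layer = {
    "conv1": [0.9, 0.2, 0.7],
    "fire2": [0.1, 0.8, 0.05, 0.6],
}
weights_per_filter = {"conv1": 27, "fire2": 16}

pruned = select_filters_to_prune(scores_by_layer, num_to_prune=3)
reduction = fraction_removed(scores_by_layer, weights_per_filter, pruned)
print(pruned, f"{reduction:.1%} of weights removed")
```

Because whole filters are removed rather than individual weights, the remaining weight tensors stay dense, which is consistent with the goal of pruning without introducing sparsity.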

Degree Type

Master of Science in Electrical and Computer Engineering

Department

Electrical and Computer Engineering

Campus location

Indianapolis

Advisor/Supervisor/Committee Chair

Dr. Mohamed El-Sharkawy

Additional Committee Member 2

Dr. Maher Rizkalla

Additional Committee Member 3

Dr. Brian King

Keywords

Convolution neural network, CNN, SqueezeNet, Pruning, L2 Normalization, CIFAR-10, Transfer learning, Fine Pruning, Coarse pruning, S32V234, activation maps, Taylor expansion, RTMaps, BlueBox, Model Compression, Computer Engineering

Licence

CC BY 4.0
