Purdue University Graduate School
Browse

Inference Engine: A high efficiency accelerator for Deep Neural Networks

Download (2.35 MB)
thesis
posted on 2021-10-12, 12:39 authored by Aliasger Tayeb ZaidyAliasger Tayeb Zaidy
Deep Neural Networks are state-of the art algorithms for various image and natural language processing tasks. These networks are composed of billions of operations working on an input to produce the desired result. Along with this computational complexity, these workloads are also massively parallel in nature. These inherent properties make deep neural networks an excellent target for custom acceleration. The main challenge faced by such accelerators is achieving a compromise between power consumption, software programmability, and resource utilization for the varied compute and data access patterns presented by DNN workloads. In this work, I present Inference Engine, a scalable and efficient DNN accelerator designed to be agnostic to the type of DNN workload. Inference Engine was designed to provide near peak hardware resource utilization, minimize data transfer, and offer a programmer friendly instruction set. Inference engine scales at the level of individually programmable clusters, each of which contains several hundred compute resources. It provides an instruction set designed to exploit parallelism within the workload while also allowing freedom for compiler based exploration of data access patterns.

History

Degree Type

  • Doctor of Philosophy

Department

  • Electrical and Computer Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Eugenio Culurciello

Additional Committee Member 2

Anand Raghunathan

Additional Committee Member 3

Vijay Raghunathan

Additional Committee Member 4

Mithuna S Thottethodi