Inference Engine: A high efficiency accelerator for Deep Neural Networks

Zaidy, Aliasger Tayeb

doi:10.25394/PGS.9108539.v1

thesis

posted on 2021-10-12, 12:39 authored by Aliasger Tayeb Zaidy

Deep Neural Networks (DNNs) are state-of-the-art algorithms for various image and natural language processing tasks. These networks are composed of billions of operations working on an input to produce the desired result. Along with this computational complexity, these workloads are also massively parallel in nature. These inherent properties make deep neural networks an excellent target for custom acceleration. The main challenge faced by such accelerators is achieving a compromise between power consumption, software programmability, and resource utilization for the varied compute and data access patterns presented by DNN workloads. In this work, I present Inference Engine, a scalable and efficient DNN accelerator designed to be agnostic to the type of DNN workload. Inference Engine was designed to provide near-peak hardware resource utilization, minimize data transfer, and offer a programmer-friendly instruction set. Inference Engine scales at the level of individually programmable clusters, each of which contains several hundred compute resources. It provides an instruction set designed to exploit parallelism within the workload while also allowing freedom for compiler-based exploration of data access patterns.

Degree Type

Doctor of Philosophy

Department

Electrical and Computer Engineering

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Eugenio Culurciello

Additional Committee Member 2

Anand Raghunathan

Additional Committee Member 3

Vijay Raghunathan

Additional Committee Member 4

Mithuna S Thottethodi

Keywords

hardware accelerator, Computer architecture, Deep neural network, Computer Engineering, Computer System Architecture

Licence

CC BY 4.0
