Preprint | Jun 1, 2016 | Closed access

EIE: Efficient Inference Engine on Compressed Deep Neural Network

Stanford University

Indexed in Crossref

Abstract

State-of-the-art deep neural networks (DNNs) have hundreds of millions of connections and are both computationally and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources and power budgets. While custom hardware helps with the computation, fetching weights from DRAM is two orders of magnitude more expensive than ALU operations and dominates the required power. The previously proposed 'Deep Compression' makes it possible to fit large DNNs (AlexNet and VGGNet) fully in on-chip SRAM. This compression is achieved by pruning the redundant connections and having multiple connections share the same weight. We propose an energy-efficient inference engine (EIE) that performs…
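The two compression steps named in the abstract, pruning and weight sharing, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`prune`, `share_weights`, `decode`), the magnitude threshold, and the evenly spaced codebook are all assumptions chosen for brevity, and the example weights are made up.

```python
def prune(weights, threshold):
    """Zero out connections whose magnitude falls below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def share_weights(weights, num_shared):
    """Map each surviving weight to the nearest entry of a small codebook.

    Assumption: the codebook is evenly spaced between the smallest and
    largest nonzero weight. Returns (codebook, indices); an index of -1
    marks a pruned connection.
    """
    nonzero = [w for w in weights if w != 0.0]
    if not nonzero:
        return [], [-1] * len(weights)
    lo, hi = min(nonzero), max(nonzero)
    if num_shared == 1 or lo == hi:
        codebook = [(lo + hi) / 2]
    else:
        step = (hi - lo) / (num_shared - 1)
        codebook = [lo + i * step for i in range(num_shared)]
    indices = [
        -1 if w == 0.0
        else min(range(len(codebook)), key=lambda i: abs(codebook[i] - w))
        for w in weights
    ]
    return codebook, indices


def decode(codebook, indices):
    """Rebuild approximate weights from the codebook and per-connection indices."""
    return [0.0 if i < 0 else codebook[i] for i in indices]


# Hypothetical weights for illustration only.
weights = [0.9, -0.05, 0.48, 0.02, -0.8, 0.51, 0.0, -0.75]
pruned = prune(weights, threshold=0.1)
codebook, indices = share_weights(pruned, num_shared=4)
restored = decode(codebook, indices)
```

After both steps, each remaining connection stores only a small index into the shared codebook rather than a full-precision weight, which is the kind of saving that lets a model fit in on-chip memory.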

Citation impact

  • Total citations: 965
  • FWCI: 71.97
  • Percentile: 100%
  • References: 73

Authors

7

Topics & keywords

Keywords
  • Computer science
  • Uncompressed video
  • Parallel computing
  • DRAM
  • Inference engine
  • Artificial neural network
  • Inference
  • Static random-access memory
UN Sustainable Development Goals
  • Affordable and clean energy