EIE: Efficient Inference Engine on Compressed Deep Neural Network

Han, Song; Liu, Xingyu; Mao, Huizi; Pu, Jing; Pedram, Ardavan; Horowitz, Mark; Dally, William J.

doi:10.1109/isca.2016.30

Preprint, Jun 1, 2016, Closed access

Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram

Stanford University

Indexed in Crossref

Abstract

State-of-the-art deep neural networks (DNNs) have hundreds of millions of connections and are both computationally and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources and power budgets. While custom hardware helps with the computation, fetching weights from DRAM is two orders of magnitude more expensive than ALU operations and dominates the required power. The previously proposed 'Deep Compression' makes it possible to fit large DNNs (AlexNet and VGGNet) fully in on-chip SRAM. This compression is achieved by pruning the redundant connections and having multiple connections share the same weight. We propose an energy-efficient inference engine (EIE) that performs…
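
The two compression steps named in the abstract, pruning and weight sharing, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `prune`, `share_weights`, and `decompress`, the threshold, the toy weight matrix, and the hand-picked three-entry codebook are hypothetical and are not taken from the paper, which may choose shared weights differently.

```python
def prune(weights, threshold):
    """Zero out connections whose magnitude falls below the threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def share_weights(weights, codebook):
    """Store each surviving weight as (row, col, index of nearest codebook entry)."""
    entries = []
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            if w != 0.0:
                idx = min(range(len(codebook)), key=lambda i: abs(codebook[i] - w))
                entries.append((r, c, idx))
    return entries


def decompress(entries, codebook, n_rows, n_cols):
    """Rebuild a dense matrix from the sparse, codebook-indexed form."""
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for r, c, idx in entries:
        dense[r][c] = codebook[idx]
    return dense


# Toy 2x3 layer (hypothetical values).
weights = [[0.9, -0.05, 0.0],
           [0.02, -1.1, 0.48]]
codebook = [-1.0, 0.5, 1.0]  # assumed shared weights, hand-picked for the example

pruned = prune(weights, threshold=0.1)
entries = share_weights(pruned, codebook)
restored = decompress(entries, codebook, n_rows=2, n_cols=3)
```

In this toy case only three of the six connections survive pruning, and each is stored as a small codebook index rather than a full-precision value, which is the kind of saving that lets a model fit in less memory. The restored matrix is an approximation of the original, not an exact copy.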

Citation impact

Total citations: 965

FWCI: 71.97
Percentile: 100%
References: 73

Authors: 7

Topics & keywords

Keywords

Computer science
Uncompressed video
Parallel computing
DRAM
Inference engine
Artificial neural network
Inference
Static random-access memory

UN Sustainable Development Goals

Affordable and clean energy
