EIE

Han, Song; Liu, Xingyu; Mao, Huizi; Pu, Jing; Pedram, Ardavan; Horowitz, Mark; Dally, William J.

doi:10.1145/3007787.3001163

Article · ACM SIGARCH Computer Architecture News · Jun 18, 2016 · Closed access

Stanford University · Nvidia (United Kingdom)

Indexed in Crossref

Abstract

State-of-the-art deep neural networks (DNNs) have hundreds of millions of connections and are both computationally and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources and power budgets. While custom hardware helps the computation, fetching weights from DRAM is two orders of magnitude more expensive than ALU operations, and dominates the required power. Previously proposed 'Deep Compression' makes it possible to fit large DNNs (AlexNet and VGGNet) fully in on-chip SRAM. This compression is achieved by pruning the redundant connections and having multiple connections share the same weight. We propose an energy efficient inference engine (EIE) that performs…
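The two compression steps named in the abstract, pruning redundant connections and letting several connections share one weight, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `prune`, `share_weights`, and `decode` are hypothetical, the magnitude threshold is arbitrary, and the evenly spaced codebook is a simple stand-in rather than the clustering procedure used by Deep Compression.

```python
def prune(weights, threshold):
    """Zero out connections whose magnitude falls below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def share_weights(weights, num_shared):
    """Map each surviving weight to the nearest of num_shared shared values.

    Assumption: the codebook is spaced evenly between the smallest and
    largest surviving weight. This is an illustrative choice, not the
    paper's method. Returns (codebook, indices); index -1 marks a pruned
    connection.
    """
    nonzero = [w for w in weights if w != 0.0]
    if not nonzero:
        return [], [-1] * len(weights)
    lo, hi = min(nonzero), max(nonzero)
    if num_shared == 1 or lo == hi:
        codebook = [(lo + hi) / 2]
    else:
        step = (hi - lo) / (num_shared - 1)
        codebook = [lo + i * step for i in range(num_shared)]
    indices = []
    for w in weights:
        if w == 0.0:
            indices.append(-1)
        else:
            # Pick the codebook entry closest to this weight.
            nearest = min(range(len(codebook)), key=lambda i: abs(codebook[i] - w))
            indices.append(nearest)
    return codebook, indices


def decode(codebook, indices):
    """Rebuild the approximate dense weights from codebook and indices."""
    return [codebook[i] if i >= 0 else 0.0 for i in indices]


weights = [0.9, -0.05, 0.45, 0.02, -0.8, 0.5]
pruned = prune(weights, threshold=0.1)
codebook, indices = share_weights(pruned, num_shared=3)
approx = decode(codebook, indices)
```

After these steps, each surviving connection is stored as a small index into the codebook instead of a full-precision value, and pruned connections need no weight at all. That reduction in storage is what lets a large network fit in on-chip SRAM, as the abstract describes.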

Citation impact

Total citations: 2,031

FWCI: 132.78
Percentile: 100%
References: 55

Authors: 7

Topics & keywords

Keywords

Computer science
Uncompressed video
Parallel computing
DRAM
Static random-access memory
Matrix multiplication
Throughput
Computer engineering

UN Sustainable Development Goals

Affordable and clean energy
