HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision

Dong, Zhen; Yao, Zhewei; Gholami, Amir; Mahoney, Michael W.; Keutzer, Kurt

doi:10.1109/iccv.2019.00038

Article · Oct 1, 2019 · Closed access

University of California, Berkeley

Indexed in Crossref

Abstract

Model size and inference speed/power have become a major challenge in the deployment of neural networks for many applications. A promising approach to address these problems is quantization. However, uniformly quantizing a model to ultra-low precision leads to significant accuracy degradation. A novel solution for this is to use mixed-precision quantization, as some parts of the network may allow lower precision as compared to other layers. However, there is no systematic way to determine the precision of different layers. A brute force approach is not feasible for deep networks, as the search space for mixed-precision is exponential in the number of layers. Another challenge is a similar factorial complexity…

Citation impact

Total citations: 452

FWCI: 20.11
Percentile: 100%
References: 89

Authors: 5

Topics & keywords

Keywords

Quantization (signal processing)
Computer science
Hessian matrix
Algorithm
Artificial neural network
Inference
Artificial intelligence
Mathematics
