Article · Oct 1, 2019 · Closed access

HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision

University of California, Berkeley

Indexed in Crossref

Abstract

Model size and inference speed/power have become a major challenge in the deployment of neural networks for many applications. A promising approach to address these problems is quantization. However, uniformly quantizing a model to ultra-low precision leads to significant accuracy degradation. A novel solution for this is to use mixed-precision quantization, as some parts of the network may allow lower precision as compared to other layers. However, there is no systematic way to determine the precision of different layers. A brute force approach is not feasible for deep networks, as the search space for mixed-precision is exponential in the number of layers. Another challenge is a similar factorial complexity…

Citation impact

Total citations
452
FWCI
20.11
Percentile
100%
References
89

Authors

5

Topics & keywords

Keywords
  • Quantization (signal processing)
  • Computer science
  • Hessian matrix
  • Algorithm
  • Artificial neural network
  • Inference
  • Artificial intelligence
  • Mathematics