Book chapter · Jan 12, 2022 · Gold OA
A Survey of Quantization Methods for Efficient Neural Network Inference
University of California, Berkeley
Indexed in Crossref
Abstract
This chapter provides approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. Over the past decade, people have observed significant improvements in the accuracy of Neural Networks (NNs) for a wide range of problems, often achieved by highly over-parameterized models. Achieving efficient, real-time NNs with optimal accuracy requires rethinking the design, training, and deployment of NN models. Model distillation involves training a large model and then using it as a teacher to train a more compact model. Loosely related to NN quantization is work in neuroscience that suggests that the human brain stores…
Citation impact
1,021 total citations
- FWCI: 125.75
- Percentile: 100%
- References: 421
Authors
6
Topics & keywords
Keywords
- Artificial neural network
- Inference
- Computer science
- Quantization (signal processing)
- Artificial intelligence
- Algorithm
UN Sustainable Development Goals
- Quality Education