Preprint · arXiv (Cornell University) · Oct 10, 2017 · Green OA

Mixed Precision Training

Indexed in: arXiv, DataCite

Abstract

Deep neural networks have enabled progress in a wide variety of applications. Growing the size of the neural network typically results in improved accuracy. As model sizes grow, the memory and compute requirements for training these models also increase. We introduce a technique to train deep neural networks using half precision floating point numbers. In our technique, weights, activations and gradients are stored in IEEE half-precision format. Half-precision floating point numbers have limited numerical range compared to single-precision numbers. We propose two techniques to handle this loss of information. Firstly, we recommend maintaining a single-precision copy of the weights that accumulates the gradients…
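
The first technique named in the abstract, a single-precision copy of the weights that accumulates the gradients, can be illustrated with a small scalar emulation. This is only a sketch under stated assumptions, not the authors' implementation: the helper names (to_fp16, to_fp32, train), the learning rate, the gradient value, and the step count are all hypothetical and chosen for illustration. It uses Python's struct module to round values to IEEE half and single precision.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def train(steps: int, lr: float, grad: float, use_master_copy: bool) -> float:
    """Apply `steps` identical gradient updates to one scalar weight.

    Hypothetical toy loop: the gradient is a constant, so only the
    effect of the storage precision on the update is visible.
    """
    master = to_fp32(1.0)  # single-precision copy of the weight
    weight = to_fp16(1.0)  # half-precision weight used by the model
    for _ in range(steps):
        g = to_fp16(grad)  # gradient stored in half precision
        if use_master_copy:
            # Accumulate the update in single precision, then refresh
            # the half-precision weight from the master copy.
            master = to_fp32(master - lr * g)
            weight = to_fp16(master)
        else:
            # Apply the update directly to the half-precision weight.
            weight = to_fp16(weight - to_fp16(lr * g))
    return weight


# Assumed illustrative values: 1000 steps, lr = 1e-3, gradient = 0.1.
fp16_only = train(1000, 1e-3, 0.1, use_master_copy=False)
with_master = train(1000, 1e-3, 0.1, use_master_copy=True)
print(fp16_only, with_master)
```

With these assumed values, each update (about 1e-4) is smaller than half the spacing between half-precision numbers just below 1.0 (about 2.4e-4), so the half-precision-only weight never moves from 1.0, while the version with a single-precision copy ends close to 0.9.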

Citation impact

877 total citations
References
27
Citations per year

Authors

11

Topics & keywords

Keywords
  • Computer science
  • Artificial neural network
  • Speedup
  • Single-precision floating-point format
  • Deep learning
  • Computation
  • Deep neural networks
  • Floating point