Mixed Precision Training
Indexed in arXiv, DataCite
Abstract
Deep neural networks have enabled progress in a wide variety of applications. Growing the size of the neural network typically results in improved accuracy. As model sizes grow, the memory and compute requirements for training these models also increase. We introduce a technique to train deep neural networks using half-precision floating-point numbers. In our technique, weights, activations and gradients are stored in IEEE half-precision format. Half-precision floating-point numbers have a limited numerical range compared to single-precision numbers. We propose two techniques to handle this loss of information. Firstly, we recommend maintaining a single-precision copy of the weights that accumulates the gradients…
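The first technique named in the abstract, a single-precision copy of the weights that accumulates the gradients, can be illustrated with a small sketch. This is not the authors' implementation: the function names `to_fp16`, `to_fp32`, and `train`, the starting weight of 1.0, and the constant update of 1e-4 are all assumptions chosen for illustration, and the rounding is emulated with Python's standard `struct` module rather than real half-precision hardware.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE half-precision (binary16) value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE single-precision (binary32) value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def train(steps: int, update: float, use_master_copy: bool) -> float:
    """Apply the same small update `steps` times (hypothetical toy loop).

    Returns the final half-precision weight.
    """
    update16 = to_fp16(update)   # the update itself is stored in half precision
    weight16 = to_fp16(1.0)      # half-precision weight used for compute
    master32 = to_fp32(1.0)      # single-precision copy (assumed setup)
    for _ in range(steps):
        if use_master_copy:
            # Accumulate in single precision, then round down for compute.
            master32 = to_fp32(master32 + update16)
            weight16 = to_fp16(master32)
        else:
            # Accumulate directly in half precision.
            weight16 = to_fp16(weight16 + update16)
    return weight16


if __name__ == "__main__":
    # Near 1.0, adjacent half-precision values are 2**-10 (about 0.001) apart,
    # so an update of 1e-4 is rounded away on every step.
    print(train(1000, 1e-4, use_master_copy=False))  # stays at 1.0
    print(train(1000, 1e-4, use_master_copy=True))   # moves to roughly 1.1
```

In this toy setting the half-precision-only weight never moves, because each update is smaller than half the spacing between representable values near 1.0, while the single-precision copy retains every update and the rounded weight follows it.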
Citation impact
- Total citations: 877
- FWCI: —
- Percentile: —
- References: 27
Authors
11
Topics & keywords
Topics
Keywords
- Computer science
- Artificial neural network
- Speedup
- Single-precision floating-point format
- Deep learning
- Computation
- Deep neural networks
- Floating point