Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation
Indexed in: arXiv, DataCite
Abstract
Stochastic neurons and hard non-linearities can be useful for a number of reasons in deep learning models, but in many cases they pose a challenging problem: how to estimate the gradient of a loss function with respect to the input of such stochastic or non-smooth neurons? I.e., can we "back-propagate" through these stochastic neurons? We examine this question, existing approaches, and compare four families of solutions, applicable in different settings. One of them is the minimum variance unbiased gradient estimator for stochastic binary neurons (a special case of the REINFORCE algorithm). A second approach, introduced here, decomposes the operation of a binary stochastic neuron into a stochastic binary part…
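To make the first family mentioned in the abstract concrete, here is a minimal sketch of a REINFORCE-style gradient estimate for a single stochastic binary neuron. It assumes the neuron fires with probability sigmoid(a) and that the loss depends only on the sampled bit h. The function names, the toy loss, and the constant baseline are illustrative assumptions, not taken from the paper; in particular, the constant baseline is not claimed to be the paper's minimum-variance choice.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def reinforce_grad_sample(a, loss_fn, rng, baseline=0.0):
    """One-sample estimate of d E[loss(h)] / d a, with h ~ Bernoulli(sigmoid(a))."""
    p = sigmoid(a)
    h = 1.0 if rng.random() < p else 0.0
    # For a Bernoulli unit with a sigmoid firing probability,
    # d log P(h) / d a = h - p. A constant baseline leaves the
    # estimate unbiased because E[h - p] = 0.
    return (h - p) * (loss_fn(h) - baseline)


def exact_grad(a, loss_fn):
    """Closed-form gradient of E[loss(h)] = p * loss(1) + (1 - p) * loss(0)."""
    p = sigmoid(a)
    return p * (1.0 - p) * (loss_fn(1.0) - loss_fn(0.0))


def toy_loss(h):
    # Hypothetical loss used only for illustration.
    return (h - 0.2) ** 2


def mean_estimate(a, loss_fn, n_samples, seed=0, baseline=0.0):
    """Average many one-sample estimates to check them against exact_grad."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += reinforce_grad_sample(a, loss_fn, rng, baseline)
    return total / n_samples
```

Calling `mean_estimate(0.3, toy_loss, 100000)` should land close to `exact_grad(0.3, toy_loss)`, with or without a baseline. The baseline affects only the spread of the individual samples, not their mean.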
Citation impact
- Total citations: 2,004
- FWCI: n/a
- Percentile: n/a
- References: 11
Authors
3
Topics & keywords
Keywords
- Differentiable function
- Estimator
- Computation
- Context (archaeology)
- Computer science
- Binary number
- Mathematics
- Stochastic neural network