Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

Bengio, Yoshua; Léonard, Nicholas; Courville, Aaron

doi:10.48550/arxiv.1308.3432

Preprint | arXiv (Cornell University) | Aug 15, 2013 | Green OA

Indexed in: arXiv, DataCite

Abstract

Stochastic neurons and hard non-linearities can be useful for a number of reasons in deep learning models, but in many cases they pose a challenging problem: how to estimate the gradient of a loss function with respect to the input of such stochastic or non-smooth neurons? I.e., can we "back-propagate" through these stochastic neurons? We examine this question, review existing approaches, and compare four families of solutions applicable in different settings. One of them is the minimum variance unbiased gradient estimator for stochastic binary neurons (a special case of the REINFORCE algorithm). A second approach, introduced here, decomposes the operation of a binary stochastic neuron into a stochastic binary part…
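
As an illustration of the kind of estimator the abstract refers to, the following is a minimal sketch of a generic score-function (REINFORCE-style) gradient estimate for a single stochastic binary neuron. It is not taken from the paper: the function names, the toy loss, and the constant `baseline` argument are assumptions made for this example, and the sketch does not attempt to reproduce the paper's minimum-variance choice of baseline.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def exact_gradient(a, loss):
    """Closed-form d/da E[loss(h)] for h ~ Bernoulli(sigmoid(a))."""
    p = sigmoid(a)
    # E[loss(h)] = p * loss(1) + (1 - p) * loss(0), and dp/da = p * (1 - p).
    return p * (1.0 - p) * (loss(1) - loss(0))


def score_function_gradient(a, loss, num_samples, rng, baseline=0.0):
    """Monte Carlo estimate of the same gradient using only samples of h.

    The loss is treated as a black box: it is evaluated, never differentiated.
    `baseline` is a hypothetical constant offset subtracted from the loss.
    """
    p = sigmoid(a)
    total = 0.0
    for _ in range(num_samples):
        h = 1 if rng.random() < p else 0
        # For a sigmoid-parameterized Bernoulli, d/da log P(h) = h - p.
        total += (loss(h) - baseline) * (h - p)
    return total / num_samples


def toy_loss(h):
    # Hypothetical loss, chosen only to make the example concrete.
    return (h - 0.3) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    a = 0.5
    print("exact    :", exact_gradient(a, toy_loss))
    print("estimated:", score_function_gradient(a, toy_loss, 100_000, rng))
```

Subtracting a constant baseline leaves the expectation of the estimate unchanged, because the term `h - p` has zero mean; a well-chosen baseline can reduce the variance of the estimate.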

Citation impact

Total citations: 2,004
FWCI: —
Percentile: —
References: 11

Authors: 3

Topics & keywords

Keywords

Differentiable function
Estimator
Computation
Context (archaeology)
Computer science
Binary number
Mathematics
Stochastic neural network
