Understanding the difficulty of training deep feedforward neural networks

Glorot, Xavier; Bengio, Yoshua

Article | Jan 1, 2010 | Closed access

Abstract

Whereas before 2006 it appears that deep multilayer neural networks were not successfully trained, since then several algorithms have been shown to successfully train them, with experimental results showing the superiority of deeper vs less deep architectures. All these experimental results were obtained with new initialization or training mechanisms. Our objective here is to understand better why standard gradient descent from random initialization is doing so poorly with deep neural networks, to better understand these recent relative successes and help design better algorithms in the future. We first observe the influence of the non-linear activation functions. We find that the logistic sigmoid activation…

Citation impact

Total citations: 12,677

FWCI: 30.29
Percentile: 100%
References: 17

Authors: 2

Topics & keywords

Keywords

Initialization
Computer science
Artificial neural network
Artificial intelligence
Deep neural networks
Deep learning
Gradient descent
Jacobian matrix and determinant

UN Sustainable Development Goals

Quality Education
