Why Does Unsupervised Pre-training Help Deep Learning?

Erhan, Dumitru; Courville, Aaron; Bengio, Yoshua; Vincent, Pascal

Article | Mar 1, 2010 | Closed access

Abstract

Much recent research has been devoted to learning algorithms for deep architectures such as Deep Belief Networks and stacks of autoencoder variants, with impressive results obtained in several areas, mostly on vision and language datasets. The best results obtained on supervised learning tasks often involve an unsupervised learning component, usually in an unsupervised pre-training phase. The main question investigated here is the following: why does unsupervised pre-training work so well? Through extensive experimentation, we explore several possible explanations discussed in the literature, including its action as a regularizer (Erhan et al., 2009b) and as an aid to optimization (Bengio et al., 2007).…

Citation impact

Total citations: 2,114

FWCI: 65.76
Percentile: 100%
References: 55

Authors: 4

Topics & keywords

Keywords

Artificial intelligence
Unsupervised learning
Computer science
Machine learning
Deep learning
Regularization (mathematics)
Generalization
Autoencoder
