Improving neural networks by preventing co-adaptation of feature detectors

Hinton, Geoffrey E.; Srivastava, Nitish; Krizhevsky, Alex; Sutskever, Ilya; Salakhutdinov, Ruslan

doi:10.48550/arxiv.1207.0580

Preprint; arXiv (Cornell University); Jul 3, 2012; Green OA

University of Toronto

Indexed in: arXiv, DataCite

Abstract

When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random "dropout" gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
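The random omission described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `dropout_forward`, its parameters, and the toy activations are hypothetical. The test-time branch, which keeps every unit and scales by the keep probability, is an assumption added here for completeness; the abstract itself only describes the training-time omission.

```python
import random


def dropout_forward(activations, p_drop=0.5, train=True, rng=None):
    """Apply dropout to one training case's hidden activations (a list of floats).

    Hypothetical helper for illustration only.
    """
    if not train:
        # Assumed test-time rule: keep every unit and scale by the keep
        # probability so the expected activation matches training.
        return [a * (1.0 - p_drop) for a in activations]
    if rng is None:
        rng = random.Random()
    # Each feature detector is omitted independently with probability p_drop.
    return [a if rng.random() >= p_drop else 0.0 for a in activations]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
hidden = [1.0] * 8       # toy activations for 8 feature detectors

# A fresh mask is drawn for every training case.
train_out = [dropout_forward(hidden, rng=rng) for _ in range(4)]
test_out = dropout_forward(hidden, train=False)
```

Because each training case draws its own mask, a unit cannot rely on any particular set of other units being present, which is the co-adaptation the abstract says dropout prevents.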

Citation impact

Total citations: 6,651

References: 18


Topics & keywords

Keywords

Overfitting
Feature
Dropout (neural networks)
Computer science
Benchmark
Adaptation
Context
Detector
