Recurrent Neural Network Regularization

Zaremba, Wojciech; Sutskever, Ilya; Vinyals, Oriol

doi:10.48550/arxiv.1409.2329

Preprint, arXiv (Cornell University), Sep 8, 2014, Green OA

Indexed in: arXiv, DataCite

Abstract

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper, we show how to correctly apply dropout to LSTMs, and show that it substantially reduces overfitting on a variety of tasks. These tasks include language modeling, speech recognition, image caption generation, and machine translation.
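
The abstract does not spell out what "correctly" means. In the paper, dropout is applied only to the non-recurrent connections (the input to each LSTM layer and the output of the top layer), while the recurrent hidden-to-hidden path is left untouched. The following is a minimal pure-Python sketch of that idea, not the authors' implementation: the function names (`dropout`, `lstm_step`, `run_stacked_lstm`, `init_params`), the gate ordering, the weight layout, and the use of inverted (rescaled) dropout are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dropout(vec, p, rng, train):
    """Inverted dropout (assumed variant): zero each unit with probability p
    and rescale survivors by 1 / (1 - p). Identity when not training."""
    if not train or p == 0.0:
        return list(vec)
    return [0.0 if rng.random() < p else v / (1.0 - p) for v in vec]


def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W has 4 * hidden rows over the concatenation [x, h_prev];
    the gate order (input, forget, output, candidate) is an assumed layout."""
    n = len(h_prev)
    xh = list(x) + list(h_prev)
    z = [sum(w * v for w, v in zip(row, xh)) + bias for row, bias in zip(W, b)]
    i, f, o, g = z[:n], z[n:2 * n], z[2 * n:3 * n], z[3 * n:]
    c = [sigmoid(f[k]) * c_prev[k] + sigmoid(i[k]) * math.tanh(g[k]) for k in range(n)]
    h = [sigmoid(o[k]) * math.tanh(c[k]) for k in range(n)]
    return h, c


def init_params(input_size, hidden, layers, seed=0):
    """Small random weights for a stack of LSTM layers (hypothetical helper)."""
    rng = random.Random(seed)
    params = []
    for layer in range(layers):
        in_dim = input_size if layer == 0 else hidden
        W = [[rng.gauss(0.0, 0.1) for _ in range(in_dim + hidden)]
             for _ in range(4 * hidden)]
        params.append((W, [0.0] * (4 * hidden)))
    return params


def run_stacked_lstm(xs, params, p=0.5, train=True, seed=0):
    """Run a stacked LSTM over the sequence xs, dropping only non-recurrent inputs."""
    rng = random.Random(seed)
    hidden = len(params[0][1]) // 4
    hs = [[0.0] * hidden for _ in params]
    cs = [[0.0] * hidden for _ in params]
    outputs = []
    for x in xs:
        inp = x
        for layer, (W, b) in enumerate(params):
            # Dropout on the vertical (layer-to-layer) input only.
            dropped = dropout(inp, p, rng, train)
            # hs[layer] and cs[layer] are passed through time without dropout.
            hs[layer], cs[layer] = lstm_step(dropped, hs[layer], cs[layer], W, b)
            inp = hs[layer]
        # Dropout on the top layer's output, e.g. before a softmax.
        outputs.append(dropout(inp, p, rng, train))
    return outputs
```

In this sketch the state passed from one time step to the next is never masked, so the memory cell can carry information over many steps, while each layer still sees a noisy version of the activations coming from below.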

Citation impact

Total citations: 2,281
References: 32

Authors: 3

Topics & keywords

Keywords

Regularization (linguistics)
Artificial neural network
Computer science
Artificial intelligence

UN Sustainable Development Goals

Quality Education