A Simple Way to Initialize Recurrent Networks of Rectified Linear Units

Le, Quoc V.; Jaitly, Navdeep; Hinton, Geoffrey E.

doi:10.48550/arxiv.1504.00941

Preprint | arXiv (Cornell University) | Apr 3, 2015 | Green OA

Indexed in: arXiv, DataCite

Abstract

Learning long-term dependencies in recurrent networks is difficult due to vanishing and exploding gradients. To overcome this difficulty, researchers have developed sophisticated optimization techniques and network architectures. In this paper, we propose a simpler solution that uses recurrent neural networks composed of rectified linear units. Key to our solution is the use of the identity matrix or its scaled version to initialize the recurrent weight matrix. We find that our solution is comparable to LSTM on our four benchmarks: two toy problems involving long-range temporal structures, a large language modeling problem, and a benchmark speech recognition problem.
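The initialization described in the abstract can be illustrated with a minimal sketch in plain Python. Only the identity (or scaled identity) recurrent matrix and the rectified linear units come from the abstract; the helper names `init_irnn` and `step`, the zero biases, and the small Gaussian input weights with standard deviation 0.001 are assumptions made here for illustration, not details confirmed by this record.

```python
import random


def init_irnn(n_hidden, n_input, scale=1.0, input_std=0.001, seed=0):
    """Build parameters for a ReLU RNN with an identity-initialized recurrent matrix.

    Hypothetical helper: the input-weight std and zero biases are assumptions.
    """
    rng = random.Random(seed)
    # Recurrent weights: identity matrix, optionally scaled (as in the abstract).
    w_hh = [[scale if i == j else 0.0 for j in range(n_hidden)]
            for i in range(n_hidden)]
    # Input weights: small Gaussian noise (assumed, not stated in the abstract).
    w_xh = [[rng.gauss(0.0, input_std) for _ in range(n_input)]
            for _ in range(n_hidden)]
    # Biases: zero (assumed).
    b = [0.0] * n_hidden
    return w_hh, w_xh, b


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def step(h, x, w_hh, w_xh, b):
    """One recurrent step: h_next = relu(W_hh h + W_xh x + b)."""
    rec = matvec(w_hh, h)
    inp = matvec(w_xh, x)
    return [max(0.0, r + i + bias) for r, i, bias in zip(rec, inp, b)]


if __name__ == "__main__":
    w_hh, w_xh, b = init_irnn(n_hidden=3, n_input=2)
    h = [0.5, 0.0, 2.0]
    # With zero input, the identity recurrence copies the state forward.
    print(step(h, [0.0, 0.0], w_hh, w_xh, b))
```

With zero input and a non-negative hidden state, each step under these assumptions returns the state unchanged, so information is neither amplified nor attenuated at initialization; choosing `scale` below 1 makes the state decay geometrically instead.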

Citation impact

Total citations: 555

FWCI: —
Percentile: —
References: 32

Authors: 3

Topics & keywords

Keywords

Benchmark (surveying)
Recurrent neural network
Computer science
Simple (philosophy)
Range (aeronautics)
Matrix (chemical analysis)
Key (lock)
Artificial intelligence