Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

Bengio, Samy; Vinyals, Oriol; Jaitly, Navdeep; Shazeer, Noam

doi:10.48550/arxiv.1506.03099

Preprint · arXiv (Cornell University) · Jun 9, 2015 · Green OA

Google (United States)

Indexed in: arXiv, DataCite

Abstract

Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists of maximizing the likelihood of each token in the sequence given the current (recurrent) state and the previous token. At inference, the unknown previous token is then replaced by a token generated by the model itself. This discrepancy between training and inference can yield errors that can accumulate quickly along the generated sequence. We propose a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided…
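The training-time choice described in the abstract, feeding either the true previous token or the model's own generated token, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract as shown is truncated and does not give the schedule, so the inverse-sigmoid decay, the helper names (`inverse_sigmoid_decay`, `choose_previous_tokens`), the `start_token` convention, and the toy `predict_next` callable are hypothetical stand-ins, not the authors' implementation. A real recurrent model would also carry hidden state, which this toy omits.

```python
import math
import random


def inverse_sigmoid_decay(step, k=100.0):
    """Probability of feeding the TRUE previous token at a training step.

    Illustrative schedule (an assumption): starts near 1 (fully guided)
    and decays towards 0 (less guided) as training progresses.
    """
    x = step / k
    if x > 700.0:  # guard: math.exp overflows for very large arguments
        return 0.0
    return k / (k + math.exp(x))


def choose_previous_tokens(true_tokens, predict_next, epsilon, rng, start_token=0):
    """Return the 'previous token' inputs fed to the model for one sequence.

    At each position, with probability epsilon the true previous token is
    fed; otherwise the token the model generated at the previous position
    is fed instead. predict_next is a hypothetical stand-in for the model.
    """
    fed = []
    prev_true = start_token
    prev_generated = start_token
    for target in true_tokens:
        use_truth = rng.random() < epsilon
        prev = prev_true if use_truth else prev_generated
        fed.append(prev)
        # The model's own guess for the current position; it becomes the
        # candidate "generated previous token" at the next position.
        prev_generated = predict_next(prev)
        prev_true = target
    return fed


def toy_model(token):
    """Toy stand-in for a trained model: always predicts token + 1 (mod 10)."""
    return (token + 1) % 10


if __name__ == "__main__":
    rng = random.Random(0)
    targets = [5, 7, 9]
    for step in (0, 300, 1000):
        eps = inverse_sigmoid_decay(step)
        print(step, round(eps, 3), choose_previous_tokens(targets, toy_model, eps, rng))
```

With `epsilon = 1.0` the sketch reduces to the fully guided scheme (every input is the true previous token), and with `epsilon = 0.0` every input after the first is the model's own output, which matches the inference-time setting the abstract describes.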

Citation impact

Total citations: 1,264

FWCI: —
Percentile: —
References: 25

Authors: 4

Topics & keywords

Keywords

Security token
Closed captioning
Computer science
Inference
Sequence (biology)
Scheme (mathematics)
Artificial neural network
Artificial intelligence

UN Sustainable Development Goals

Quality Education
