Abstract

Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, their applicability has so far been limited. This paper presents a novel method for training RNNs to label unsegmented sequences directly, thereby solving both problems. An experiment on the TIMIT speech corpus demonstrates its advantages over both a baseline…

Citation impact

  • Total citations: 5,434
  • FWCI: 9.79
  • Percentile: 100%
  • References: 15

Authors

4 authors

Topics & keywords

Keywords
  • TIMIT
  • Computer science
  • Speech recognition
  • Recurrent neural network
  • Connectionism
  • Hidden Markov model
  • Sequence (biology)
  • Artificial intelligence
UN Sustainable Development Goals
  • Quality Education

Funding