Hybrid speech recognition with Deep Bidirectional LSTM

Graves, Alex; Jaitly, Navdeep; Mohamed, Abdelrahman

doi:10.1109/asru.2013.6707742

Article, Dec 1, 2013, Closed access

Alex Graves, Navdeep Jaitly, Abdelrahman Mohamed

University of Toronto

Indexed in Crossref

Abstract

Deep Bidirectional LSTM (DBLSTM) recurrent neural networks have recently been shown to give state-of-the-art performance on the TIMIT speech database. However, the results in that work relied on recurrent-neural-network-specific objective functions, which are difficult to integrate with existing large vocabulary speech recognition systems. This paper investigates the use of DBLSTM as an acoustic model in a standard neural network-HMM hybrid system. We find that a DBLSTM-HMM hybrid gives results on TIMIT that are as good as those of the previous work. It also outperforms both GMM and deep network benchmarks on a subset of the Wall Street Journal corpus. However, the improvement in word error rate over the deep network is…

Citation impact

1,786

total citations

FWCI: 84.99
Percentile: 100%
References: 25

Authors

3

Topics & keywords

Keywords

TIMIT
Computer science
Word error rate
Speech recognition
Leverage (statistics)
Hidden Markov model
Artificial neural network
Recurrent neural network