Convolutional Neural Networks for Speech Recognition
York University · University of Toronto · +1 more institution
Abstract
Recently, the hybrid deep neural network (DNN)-hidden Markov model (HMM) has been shown to significantly improve speech recognition performance over the conventional Gaussian mixture model (GMM)-HMM. The performance improvement is partially attributed to the ability of the DNN to model complex correlations in speech features. In this paper, we show that further error rate reduction can be obtained by using convolutional neural networks (CNNs). We first present a concise description of the basic CNN and explain how it can be used for speech recognition. We further propose a limited-weight-sharing scheme that can better model speech features. The special structure such as local connectivity, weight sharing, and…
Citation impact
- FWCI: 128.57
- Percentile: 100%
- References: 49
Authors
- 6
Topics & keywords
- TIMIT
- Computer science
- Speech recognition
- Hidden Markov model
- Word error rate
- Pooling
- Convolutional neural network
- Artificial intelligence
- Quality Education