Convolutional Neural Networks for Speech Recognition

Abdel‐Hamid, Ossama; Mohamed, Abdelrahman; Jiang, Hui; Deng, Li; Penn, Gerald; Yu, Dong

doi:10.1109/taslp.2014.2339736

Article · IEEE/ACM Transactions on Audio, Speech, and Language Processing · Jul 16, 2014 · Closed access

York University · University of Toronto · +1 more institution

Indexed in Crossref

Abstract

Recently, the hybrid deep neural network (DNN)-hidden Markov model (HMM) has been shown to significantly improve speech recognition performance over the conventional Gaussian mixture model (GMM)-HMM. The performance improvement is partially attributed to the ability of the DNN to model complex correlations in speech features. In this paper, we show that further error rate reduction can be obtained by using convolutional neural networks (CNNs). We first present a concise description of the basic CNN and explain how it can be used for speech recognition. We further propose a limited-weight-sharing scheme that can better model speech features. The special structure such as local connectivity, weight sharing, and…
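As a rough illustration of the local connectivity, weight sharing, and pooling named in the abstract, the following is a minimal sketch in plain Python. It assumes the convolution runs across the frequency bands of a single frame and uses full weight sharing; it does not reproduce the limited-weight-sharing scheme the paper proposes. The function names (`conv1d_freq`, `relu`, `max_pool`), the kernel weights, and the input values are all hypothetical and chosen only for readability.

```python
from typing import List


def conv1d_freq(bands: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Slide one shared kernel across the frequency bands (no padding)."""
    width = len(kernel)
    out = []
    for start in range(len(bands) - width + 1):
        # Local connectivity: each output sees only `width` neighbouring bands.
        window = bands[start:start + width]
        # Weight sharing: the same kernel is reused at every position.
        out.append(sum(w * x for w, x in zip(kernel, window)) + bias)
    return out


def relu(values: List[float]) -> List[float]:
    """Clamp negative activations to zero."""
    return [max(0.0, v) for v in values]


def max_pool(values: List[float], size: int) -> List[float]:
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Hypothetical filter-bank energies for one frame (made-up numbers).
frame = [1.0, 4.0, 9.0, 3.0, 2.0, 8.0, 5.0, 1.0]
kernel = [1.0, 0.0, -1.0]  # hypothetical weights, not taken from the paper

feature_map = relu(conv1d_freq(frame, kernel))
pooled = max_pool(feature_map, size=2)
print(feature_map)  # [0.0, 1.0, 7.0, 0.0, 0.0, 7.0]
print(pooled)       # [1.0, 7.0, 7.0]
```

Pooling keeps only the strongest response in each pair of neighbouring positions, so a peak that moves by one band within a pooling window leaves the pooled output unchanged. That is the sense in which such a layer tolerates small shifts along the axis it convolves over.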

Citation impact

Total citations: 2,278
FWCI: 128.57
Percentile: 100%
References: 49

Authors: 6

Topics & keywords

Keywords

TIMIT
Computer science
Speech recognition
Hidden Markov model
Word error rate
Pooling
Convolutional neural network
Artificial intelligence

UN Sustainable Development Goals

Quality Education
