Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups

Hinton, Geoffrey E.; Deng, Li; Yu, Dong; Dahl, George E.; Mohamed, Abdelrahman; Jaitly, Navdeep; Senior, Andrew; Vanhoucke, Vincent; Nguyen, Patrick; Sainath, Tara N.; Kingsbury, Brian

doi:10.1109/msp.2012.2205597

Article · IEEE Signal Processing Magazine · Oct 19, 2012 · Closed access

University of Toronto · University of Waterloo · +4 more institutions

Indexed in Crossref

Abstract

Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition benchmarks, sometimes by a large margin. This article provides an…
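The alternative described in the abstract, a feed-forward network that maps several frames of coefficients to posterior probabilities over HMM states, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the toy layer sizes, the context width, the logistic hidden units, the random untrained weights, and the helper names `splice_frames` and `state_posteriors`. It shows only the forward pass, not the training methods the article discusses.

```python
import math
import random


def splice_frames(frames, center, context):
    """Concatenate frames[center-context .. center+context], clamping at the edges."""
    last = len(frames) - 1
    window = []
    for t in range(center - context, center + context + 1):
        window.extend(frames[min(max(t, 0), last)])
    return window


def affine(weights, biases, x):
    """Compute W x + b for a weight matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_out, n_in):
    """Untrained layer with small random weights and zero biases (illustrative only)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def state_posteriors(layers, x):
    """Hidden layers with logistic units, then a softmax over HMM states."""
    for weights, biases in layers[:-1]:
        x = sigmoid(affine(weights, biases, x))
    weights, biases = layers[-1]
    return softmax(affine(weights, biases, x))


# Hypothetical toy sizes; real acoustic models are far larger.
N_COEFFS, CONTEXT, N_HIDDEN, N_STATES = 3, 2, 8, 4

rng = random.Random(0)
frames = [[rng.gauss(0.0, 1.0) for _ in range(N_COEFFS)] for _ in range(10)]

input_dim = N_COEFFS * (2 * CONTEXT + 1)
layers = [
    random_layer(rng, N_HIDDEN, input_dim),
    random_layer(rng, N_HIDDEN, N_HIDDEN),
    random_layer(rng, N_STATES, N_HIDDEN),
]

# One posterior distribution over HMM states for the window centered on frame 0.
posteriors = state_posteriors(layers, splice_frames(frames, center=0, context=CONTEXT))
```

The softmax output is what makes the result a distribution over HMM states: each entry is positive and the entries sum to one, so a decoder can compare states frame by frame.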

Citation impact

Total citations: 10,279

FWCI: 411.54
Percentile: 100%
References: 86

Authors: 11

Keywords

Hidden Markov model
Speech recognition
Computer science
Mixture model
Artificial neural network
Margin (machine learning)
Deep neural networks
Pattern recognition (psychology)
