Article · Dec 1, 2011 · Closed access
Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription
Microsoft Research Asia (China) · Tsinghua University · +1 more institution
Indexed in Crossref
Abstract
We investigate the potential of Context-Dependent Deep-Neural-Network HMMs, or CD-DNN-HMMs, from a feature-engineering perspective. Recently, we showed that for speaker-independent transcription of phone calls (NIST RT03S Fisher data), CD-DNN-HMMs reduced the word error rate by as much as one third, from 27.4% (obtained by discriminatively trained Gaussian-mixture HMMs with HLDA features) to 18.5%, using 300+ hours of training data (Switchboard), 9000+ tied triphone states, and up to 9 hidden network layers.
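The "one third" in the abstract is a relative reduction, not an absolute one. A minimal sketch of the arithmetic, using only the two word error rates quoted above (the function name `relative_reduction` is illustrative and does not come from the paper):

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction from baseline to improved, as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Word error rates quoted in the abstract, in percent.
gmm_wer = 27.4  # discriminatively trained Gaussian-mixture HMMs with HLDA features
dnn_wer = 18.5  # CD-DNN-HMM

absolute = gmm_wer - dnn_wer                    # percentage points
relative = relative_reduction(gmm_wer, dnn_wer)  # fraction of the baseline

print(f"absolute reduction: {absolute:.1f} points")  # 8.9 points
print(f"relative reduction: {relative:.1%}")         # 32.5%
```

The absolute drop is 8.9 percentage points. Dividing by the 27.4% baseline gives roughly 32.5%, which is the "as much as one third" stated in the abstract.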
Citation impact
638 total citations
- FWCI: 79.12
- Percentile: 100%
- References: 25
Authors: 4
Topics & keywords
Keywords
- Speech recognition
- Computer science
- NIST
- Transcription (linguistics)
- Artificial neural network
- Phone
- Artificial intelligence
- Feature (linguistics)
UN Sustainable Development Goals
- Reduced inequalities