Article | May 1, 2013 | Closed access
Statistical parametric speech synthesis using deep neural networks
Indexed in Crossref
Abstract
Conventional approaches to statistical parametric speech synthesis typically use decision tree-clustered context-dependent hidden Markov models (HMMs) to represent probability densities of speech parameters given texts. Speech parameters are generated from the probability densities to maximize their output probabilities, then a speech waveform is reconstructed from the generated parameters. This approach is reasonably effective but has some limitations; for example, decision trees are inefficient at modeling complex context dependencies. This paper examines an alternative scheme that is based on a deep neural network (DNN). The relationship between input texts and their acoustic realizations is modeled by a DNN.…
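To illustrate the idea in the abstract of modeling the relationship between input text and its acoustic realization with a DNN, here is a minimal sketch of a feed-forward network that maps one frame's linguistic feature vector to a vector of acoustic parameters. Everything concrete in it is an assumption made for illustration and does not come from the abstract: the feature sizes (`N_LINGUISTIC`, `N_ACOUSTIC`), the hidden layer widths, the tanh activation, the linear output layer, and the function names. The weights are random and untrained, so the output only demonstrates the shape of the mapping.

```python
import math
import random


def init_dnn(layer_sizes, seed=0):
    """Create random, untrained weights for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def affine(weights, biases, x):
    """Compute W x + b for one input vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def dnn_forward(layers, x):
    """Map linguistic features to acoustic parameters.

    Hidden layers use tanh (an assumption); the output layer is linear
    because the targets are real-valued speech parameters.
    """
    h = x
    for weights, biases in layers[:-1]:
        h = [math.tanh(a) for a in affine(weights, biases, h)]
    weights, biases = layers[-1]
    return affine(weights, biases, h)


# Hypothetical sizes, chosen small so the example runs instantly.
N_LINGUISTIC = 20  # e.g. encoded phoneme identity and context answers
N_ACOUSTIC = 4     # e.g. a few spectral and excitation parameters

layers = init_dnn([N_LINGUISTIC, 16, 16, N_ACOUSTIC])

# One made-up frame of linguistic features in [0, 1).
rng = random.Random(1)
frame_features = [rng.random() for _ in range(N_LINGUISTIC)]

acoustic_params = dnn_forward(layers, frame_features)
print(len(acoustic_params))
```

In a real system the network would be trained on paired text and speech data, and the predicted parameters would be passed to a vocoder to reconstruct the waveform; both steps are outside this sketch.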
Citation impact
- Total citations: 831
- FWCI: 89.95
- Percentile: 100%
- References: 49
Authors: 3
Topics & keywords
Keywords
- Hidden Markov model
- Computer science
- Parametric statistics
- Artificial neural network
- Speech recognition
- Context (archaeology)
- Decision tree
- Waveform
UN Sustainable Development Goals
- Peace, Justice and Strong Institutions