Article · May 1, 2013 · Closed access

Statistical parametric speech synthesis using deep neural networks

Google (United States)

Indexed in Crossref

Abstract

Conventional approaches to statistical parametric speech synthesis typically use decision tree-clustered context-dependent hidden Markov models (HMMs) to represent probability densities of speech parameters given texts. Speech parameters are generated from the probability densities to maximize their output probabilities, then a speech waveform is reconstructed from the generated parameters. This approach is reasonably effective but has a couple of limitations, e.g. decision trees are inefficient at modeling complex context dependencies. This paper examines an alternative scheme that is based on a deep neural network (DNN). The relationship between input texts and their acoustic realizations is modeled by a DNN.…

Citation impact

  • Total citations: 831
  • FWCI: 89.95
  • Percentile: 100%
  • References: 49

Authors

3

Topics & keywords

Keywords
  • Hidden Markov model
  • Computer science
  • Parametric statistics
  • Artificial neural network
  • Speech recognition
  • Context (archaeology)
  • Decision tree
  • Waveform
UN Sustainable Development Goals
  • Peace, Justice and strong institutions