Unit selection in a concatenative speech synthesis system using a large speech database
Advanced Telecommunications Research Institute International
Abstract
One approach to the generation of natural-sounding synthesized speech waveforms is to select and concatenate units from a large speech database. Units (in the current work, phonemes) are selected to produce a natural realisation of a target phoneme sequence predicted from text which is annotated with prosodic and phonetic context information. We propose that the units in a synthesis database can be considered as a state transition network in which the state occupancy cost is the distance between a database unit and a target, and the transition cost is an estimate of the quality of concatenation of two consecutive units. This framework has many similarities to HMM-based speech recognition. A pruned Viterbi…
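The state-transition view described in the abstract can be illustrated with a small dynamic-programming search. The following is a minimal sketch, not the authors' implementation: the function name `select_units`, the tuple layout of the units, the two toy cost functions, and the simple "keep the N cheapest hypotheses" pruning are all assumptions made for illustration. The abstract only states that the state occupancy cost is a unit-to-target distance, that the transition cost estimates concatenation quality, and that a pruned Viterbi search is involved.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")  # target specification (hypothetical type)
U = TypeVar("U")  # database unit (hypothetical type)


def select_units(
    targets: Sequence[T],
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[T, U], float],
    join_cost: Callable[[U, U], float],
    beam_width: Optional[int] = None,
) -> Tuple[float, List[U]]:
    """Viterbi search over candidate units, one candidate list per target.

    target_cost plays the role of the state occupancy cost and join_cost
    the role of the transition cost. beam_width (an assumed pruning
    scheme) keeps only the cheapest hypotheses at each position.
    """
    if len(targets) != len(candidates):
        raise ValueError("need one candidate list per target")
    if not targets:
        return 0.0, []
    if any(len(c) == 0 for c in candidates):
        raise ValueError("every target needs at least one candidate")

    def prune(layer):
        # Sort by accumulated cost only, so units never get compared.
        layer = sorted(layer, key=lambda h: h[0])
        return layer if beam_width is None else layer[:beam_width]

    # Each hypothesis is (accumulated cost, index of predecessor, unit).
    layers = [prune([(target_cost(targets[0], u), -1, u) for u in candidates[0]])]

    for t, cands in zip(targets[1:], candidates[1:]):
        prev_layer = layers[-1]
        layer = []
        for u in cands:
            # Cheapest way to reach u from any surviving predecessor.
            best_cost, best_prev = min(
                (cost + join_cost(prev_u, u), j)
                for j, (cost, _, prev_u) in enumerate(prev_layer)
            )
            layer.append((best_cost + target_cost(t, u), best_prev, u))
        layers.append(prune(layer))

    # Backtrace from the cheapest final hypothesis.
    idx = min(range(len(layers[-1])), key=lambda j: layers[-1][j][0])
    total = layers[-1][idx][0]
    path = []
    for layer in reversed(layers):
        _, back, unit = layer[idx]
        path.append(unit)
        idx = back
    path.reverse()
    return total, path


# Toy data (hypothetical): a unit is (phoneme, pitch in Hz, database index),
# a target is (phoneme, desired pitch in Hz).
toy_targets = [("h", 100), ("i", 120)]
toy_candidates = [
    [("h", 100, 0), ("h", 110, 5)],
    [("i", 150, 1), ("i", 120, 6)],
]


def toy_target_cost(t, u):
    return abs(t[1] - u[1])  # pitch mismatch against the target


def toy_join_cost(a, b):
    return 0 if b[2] == a[2] + 1 else 20  # adjacent database units join for free


full_cost, full_path = select_units(
    toy_targets, toy_candidates, toy_target_cost, toy_join_cost
)
pruned_cost, pruned_path = select_units(
    toy_targets, toy_candidates, toy_target_cost, toy_join_cost, beam_width=1
)
```

In this toy case the full search picks the two units that were adjacent in the database, even though the first of them is not the closest match to its target, because the free join outweighs the pitch mismatch. With `beam_width=1` that path is pruned away after the first position and a costlier sequence is returned, which shows the usual trade-off of pruning: less work per step, but no guarantee of the globally cheapest path.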
Citation impact
- FWCI: 55.43
- Percentile: 100%
- References: 5
Authors
- 2
Topics & keywords
- Computer science
- Selection (genetic algorithm)
- Speech synthesis
- Speech recognition
- Natural language processing
- Artificial intelligence