Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory

Toda, Tomoki; Black, Alan W.; Tokuda, Keiichi

doi:10.1109/tasl.2007.907344

Article · IEEE Transactions on Audio, Speech, and Language Processing · Oct 15, 2007 · Closed access

Nara Institute of Science and Technology · Carnegie Mellon University · +1 more institution

Indexed in Crossref

Abstract

In this paper, we describe a novel spectral conversion method for voice conversion (VC). A Gaussian mixture model (GMM) of the joint probability density of source and target features is employed for performing spectral conversion between speakers. The conventional method converts spectral parameters frame by frame based on the minimum mean square error. Although it is reasonably effective, the deterioration of speech quality is caused by some problems: 1) appropriate spectral movements are not always caused by the frame-based conversion process, and 2) the converted spectra are excessively smoothed by statistical modeling. In order to address those problems, we propose a conversion method based on the…
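For orientation, the conventional baseline the abstract refers to (frame-by-frame conversion under the minimum mean square error criterion with a joint source-target GMM) can be sketched as below. This is a minimal illustration of that baseline only, not the trajectory-based method the paper proposes, and it assumes an already trained joint GMM. The function names (`log_gaussian`, `mmse_convert_frame`, `mmse_convert`) and the parameter layout are hypothetical and chosen for this example.

```python
import numpy as np


def log_gaussian(x, mean, cov):
    """Log-density of a multivariate Gaussian evaluated at vector x."""
    diff = x - mean
    _, log_det = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (x.shape[0] * np.log(2.0 * np.pi) + log_det + maha)


def mmse_convert_frame(x, weights, means, covs):
    """Convert one source frame x with a joint GMM (hypothetical layout).

    means[m] stacks [mu_x, mu_y]; covs[m] is the block matrix
    [[S_xx, S_xy], [S_yx, S_yy]] of mixture component m.
    """
    d = x.shape[0]
    # Posterior P(m | x), computed from the source marginal of each component.
    log_lik = np.array([
        np.log(w) + log_gaussian(x, mu[:d], S[:d, :d])
        for w, mu, S in zip(weights, means, covs)
    ])
    post = np.exp(log_lik - log_lik.max())  # subtract max to avoid underflow
    post /= post.sum()
    # MMSE estimate: posterior-weighted sum of the conditional means E[y | x, m].
    y = np.zeros(means[0].shape[0] - d)
    for p, mu, S in zip(post, means, covs):
        y += p * (mu[d:] + S[d:, :d] @ np.linalg.solve(S[:d, :d], x - mu[:d]))
    return y


def mmse_convert(X, weights, means, covs):
    """Convert a sequence of frames independently, one frame at a time."""
    return np.array([mmse_convert_frame(x, weights, means, covs) for x in X])


# Toy usage with made-up parameters: one component, 1-D source and target.
toy_weights = np.array([1.0])
toy_means = [np.array([0.0, 1.0])]
toy_covs = [np.array([[1.0, 0.5], [0.5, 1.0]])]
converted = mmse_convert(np.array([[2.0]]), toy_weights, toy_means, toy_covs)
```

Because each frame is converted without reference to its neighbours, nothing in this sketch constrains how the output moves over time, which corresponds to the first problem the abstract lists.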

Citation impact

Total citations: 972

FWCI: 23.43
Percentile: 100%
References: 55

Topics & keywords

Keywords

Mixture model
Frame (networking)
Trajectory
Computer science
Feature (linguistics)
Gaussian
Mean squared error
Process (computing)
