Proceedings Paper

Voice conversion using dynamic features for high quality transformation
Author(s): Wei Wang; Zhen Yang

Paper Abstract

A novel voice morphing method is proposed to make the speech of a source speaker sound as if it were uttered by a target speaker. The method is based on the Gaussian Mixture Model (GMM). However, traditional GMM-based conversion suffers from over-smoothing and can produce discontinuities in the converted speech because the extracted feature information is inaccurate. To overcome these problems, we take into account the dynamic spectral features between frames. The conversion function is also modified to deal with the discontinuities. The Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrogram (STRAIGHT) algorithm is adopted for analysis and synthesis. Objective and perceptual experiments show that the quality of speech converted by the proposed method is significantly improved compared with the traditional GMM method.
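The abstract does not state how the dynamic features are computed. As a rough illustration only, the sketch below assumes the commonly used regression-based delta coefficients, appended to each static spectral frame so that a joint static-plus-dynamic vector could be modeled by a GMM. The function name `delta_features`, the `window` parameter, and the edge handling (repeating the first and last frames) are assumptions made for this example, not details taken from the paper.

```python
def delta_features(frames, window=1):
    """Append first-order delta coefficients to each static feature frame.

    Assumed definition (not from the paper):
        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Frames beyond the sequence edges are replaced by the nearest edge frame.
    """
    if not frames:
        return []
    num_frames = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, window + 1))
    augmented = []
    for t in range(num_frames):
        delta = [0.0] * dim
        for n in range(1, window + 1):
            nxt = frames[min(t + n, num_frames - 1)]  # clamp at last frame
            prv = frames[max(t - n, 0)]               # clamp at first frame
            for k in range(dim):
                delta[k] += n * (nxt[k] - prv[k])
        # Joint vector: static coefficients followed by their deltas.
        augmented.append(list(frames[t]) + [d / denom for d in delta])
    return augmented


# Toy input: three frames of two-dimensional static features.
static = [[1.0, 0.0], [2.0, 0.5], [4.0, 1.0]]
joint = delta_features(static)
```

Each output vector has twice the static dimension. In a conversion system of this kind, such joint vectors from time-aligned source and target frames would typically be used to train the mixture model, although the paper's exact formulation may differ.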

Paper Details

Date Published: 26 February 2010
PDF: 7 pages
Proc. SPIE 7546, Second International Conference on Digital Image Processing, 75463Q (26 February 2010); doi: 10.1117/12.855168
Wei Wang, Nanjing Univ. of Posts and Telecommunications (China)
Zhen Yang, Nanjing Univ. of Posts and Telecommunications (China)

Published in SPIE Proceedings Vol. 7546:
Second International Conference on Digital Image Processing
Kamaruzaman Jusoff; Yi Xie, Editor(s)

© SPIE.