Abstract
Voice conversion (VC) is the task of modifying a source speaker's voice to match that of a specific target speaker. Traditional methods use Gaussian mixture models (GMM), but the quality of the converted speech is often badly degraded by over-smoothing. More recent approaches such as Dynamic Frequency Warping (DFW) preserve more spectral detail during transformation, but require explicit formant frequency estimates, and estimation errors result in poor similarity between the converted speech and the target speaker. This paper proposes a new method for voice conversion called Minimum Distance Spectral Mapping (MDSM), based on a frequency-warped point-to-point mapping that robustly and accurately transforms formant frequencies while also maintaining spectral details. The proposed MDSM method uses a minimum distance alignment between source and target speakers, rather than direct formant estimates, which increases robustness and also preserves other spectral details such as formant bandwidth. Results show that the proposed method offers a good trade-off between voice quality and identity similarity, outperforming traditional GMM and DFW in both subjective and objective evaluations.
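The abstract does not give the details of the MDSM algorithm, so the following is only an illustrative sketch of the general idea of a minimum distance alignment along the frequency axis, not the authors' method. It assumes a dynamic-programming (DTW-style) alignment between one source and one target spectral envelope, with absolute difference as the local distance. The function names `min_distance_warp` and `apply_warp` are hypothetical.

```python
import numpy as np


def min_distance_warp(src_env, tgt_env):
    """For each target frequency bin, return the (fractional) source bin
    aligned to it by a monotonic minimum-cumulative-distance path.

    Assumption: the local distance is |src - tgt|; the paper may use another.
    """
    n, m = len(src_env), len(tgt_env)
    cost = np.abs(src_env[:, None] - tgt_env[None, :])

    # Accumulate the minimum cumulative distance with DTW-style steps.
    acc = np.full((n, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
                acc[i - 1, j] if i > 0 else np.inf,
                acc[i, j - 1] if j > 0 else np.inf,
            )
            acc[i, j] = cost[i, j] + best

    # Backtrack from the last bin pair to the first.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))

    # Average the source bins aligned to each target bin.
    warp = np.zeros(m)
    counts = np.zeros(m)
    for i, j in path:
        warp[j] += i
        counts[j] += 1
    return warp / counts


def apply_warp(src_env, warp):
    """Resample the source envelope at the warped bin positions."""
    return np.interp(warp, np.arange(len(src_env)), src_env)


# Toy example: a single "formant" peak at bin 20 is moved to bin 30.
bins = np.arange(64)
src = np.exp(-0.5 * ((bins - 20) / 3.0) ** 2)
tgt = np.exp(-0.5 * ((bins - 30) / 3.0) ** 2)
warp = min_distance_warp(src, tgt)
converted = apply_warp(src, warp)
```

In this toy case the peak is shifted toward the target position without any explicit formant estimate, and because the source envelope is resampled rather than replaced, its shape (for example the peak width) carries over. A real system would operate on per-frame spectral envelopes and would need a mapping learned from training data rather than a target envelope at conversion time.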
Original language | English |
---|---|
Title of host publication | 2015 5th International Conference on Information Science and Technology, ICIST 2015 |
Pages | 356-359 |
Number of pages | 4 |
ISBN (Electronic) | 9781479974894 |
State | Published - Oct 2 2015 |
Event | 5th International Conference on Information Science and Technology, ICIST 2015 - Changsha, Hunan, China; Duration: Apr 24 2015 → Apr 26 2015 |
Publication series
Name | 2015 5th International Conference on Information Science and Technology, ICIST 2015 |
---|
Conference
Conference | 5th International Conference on Information Science and Technology, ICIST 2015 |
---|---|
Country/Territory | China |
City | Changsha, Hunan |
Period | 4/24/15 → 4/26/15 |
Bibliographical note
Publisher Copyright: © 2015 IEEE.
Keywords
- Gaussian mixture models
- Point-to-point mapping
- Voice conversion
- Frequency warping
ASJC Scopus subject areas
- Information Systems