This paper presents an approach to vowel classification that analyzes the dynamics of speech production in a reconstructed phase space. The approach can capture nonlinearities that may exist in speech production. Global flow reconstruction is used to generate a quantitative description of the structure and trajectory of vowel attractors in the reconstructed phase space. A distance measure is defined to quantify the dynamic similarity between phoneme attractors. Templates of the dynamics for each vowel class are selected by cluster analysis. Out-of-sample vowel phonemes are classified with a nearest neighbor classifier. Experiments are conducted on both speaker-dependent and speaker-independent vowel classification tasks using the TIMIT corpus. Preliminary experimental results show that vowel classification by nonlinear dynamics analysis can produce results similar to those of a classifier using Mel frequency cepstral coefficient (MFCC) features.
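As a rough illustration of two steps named in the abstract, the sketch below builds a phase-space trajectory from a one-dimensional signal with a time-delay embedding and assigns a label by nearest-neighbor search over stored templates. It is not the paper's implementation: time-delay embedding is swapped in here as a common way to reconstruct a phase space, the abstract does not give the embedding dimension, the lag, or the attractor distance measure, so `dim`, `lag`, and the `distance` callable are assumptions, and the function names are hypothetical.

```python
import numpy as np


def delay_embed(signal, dim=3, lag=6):
    """Time-delay embedding of a 1-D signal into an (n_points, dim) trajectory.

    `dim` and `lag` are illustrative defaults, not values from the paper.
    """
    x = np.asarray(signal, dtype=float)
    n_points = len(x) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for the requested dim and lag")
    # Column k holds the signal shifted by k * lag samples.
    return np.column_stack([x[k * lag:k * lag + n_points] for k in range(dim)])


def nearest_neighbor_label(query, templates, distance):
    """Return the label of the template closest to `query`.

    `templates` is a list of (label, trajectory) pairs. `distance` is any
    callable taking two trajectories; the paper's dynamic-similarity
    measure is not specified in the abstract, so it is passed in here.
    """
    best_label, best_dist = None, float("inf")
    for label, trajectory in templates:
        d = distance(query, trajectory)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```

In use, each vowel segment would be embedded with `delay_embed`, and the resulting trajectory compared against the per-class templates with whatever distance function is chosen.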
|State||Published - 2003|
|Event||2003 ISCA Tutorial and Research Workshop on Nonlinear Speech Processing, NOLISP 2003 - Le Croisic, France|
Duration: May 20 2003 → May 23 2003
Bibliographical note
Funding Information:
This material is based upon work supported by the National Science Foundation under Grant No. IIS-0113508.
© NOLISP 2003. All Rights Reserved.
ASJC Scopus subject areas
- Artificial Intelligence
- Signal Processing
- Linguistics and Language