Physiologically-motivated feature extraction for speaker identification

Jianglin Wang, Michael T. Johnson

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

14 Scopus citations

Abstract

This paper introduces the use of three physiologically-motivated features for speaker identification, Residual Phase Cepstrum Coefficients (RPCC), Glottal Flow Cepstrum Coefficients (GLFCC) and Teager Phase Cepstrum Coefficients (TPCC). These features capture speaker-discriminative characteristics from different aspects of glottal source excitation patterns. The proposed physiologically-driven features give better results with lower model complexities, and also provide complementary information that can improve overall system performance even for larger amounts of data. Results on speaker identification using the YOHO corpus demonstrate that these physiologically-driven features are both more accurate than and complementary to traditional mel-frequency cepstral coefficients (MFCC). In particular, the incorporation of the proposed glottal source features offers significant overall improvement to the robustness and accuracy of speaker identification tasks.

Original languageEnglish
Title of host publication2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Pages1690-1694
Number of pages5
DOIs
StatePublished - 2014
Event2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration: May 4 2014May 9 2014

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)1520-6149

Conference

Conference2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country/TerritoryItaly
CityFlorence
Period5/4/145/9/14

Keywords

  • Glottal source excitation and GMM-UBM
  • Speaker distinctive feature
  • Speaker identification

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Physiologically-motivated feature extraction for speaker identification'. Together they form a unique fingerprint.

Cite this