Vocal source features for bilingual speaker identification

Jianglin Wang, Michael T. Johnson

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

6 Scopus citations

Abstract

This paper introduces two new features for speaker identification, Residual Phase Cepstrum Coefficients (RPCC) and Glottal Flow Cepstrum Coefficients (GLFCC), which capture speaker-specific characteristics of the vocal excitation pattern. Results on a cross-lingual speaker identification task taken from the NIST 2004 SRE demonstrate that the RPCC and GLFCC features yield significantly higher accuracy than traditional mel-frequency cepstral coefficients (MFCC). In particular, the two new features give better results with smaller amounts of training data, due to lower model complexity.

Original language: English
Title of host publication: 2013 IEEE China Summit and International Conference on Signal and Information Processing, ChinaSIP 2013 - Proceedings
Pages: 170-173
Number of pages: 4
State: Published - 2013
Event: 2013 IEEE China Summit and International Conference on Signal and Information Processing, ChinaSIP 2013 - Beijing, China
Duration: Jul 6 2013 to Jul 10 2013

Publication series

Name: 2013 IEEE China Summit and International Conference on Signal and Information Processing, ChinaSIP 2013 - Proceedings

Conference

Conference: 2013 IEEE China Summit and International Conference on Signal and Information Processing, ChinaSIP 2013
Country/Territory: China
City: Beijing
Period: 7/6/13 to 7/10/13

Keywords

  • Glottal source excitation
  • IAIF and GMM
  • Speaker identification

ASJC Scopus subject areas

  • Information Systems
  • Signal Processing
