Stress and emotion classification using jitter and shimmer features

Li Xi, Tao Jidong, Michael T. Johnson, Joseph Soltis, Anne Savage, Kirsten M. Leong, John D. Newman

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

75 Scopus citations

Abstract

In this paper, we evaluate the use of appended jitter and shimmer speech features for the classification of human speaking styles and of animal vocalization arousal levels. Jitter and shimmer features are extracted from the fundamental frequency contour and added to baseline spectral features, specifically Mel-frequency Cepstral Coefficients (MFCCs) for human speech and Greenwood Function Cepstral Coefficients (GFCCs) for animal vocalizations. Hidden Markov Models (HMMs) with Gaussian Mixture Model (GMM) state distributions are used for classification. The appended jitter and shimmer features result in an increase in classification accuracy for several illustrative datasets, including the SUSAS dataset for human speaking styles as well as vocalizations labeled by arousal level for African Elephant and Rhesus Monkey species.
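
The abstract does not give formulas for the two features, so the following is only a minimal sketch under assumed definitions: jitter as the mean absolute difference between consecutive pitch periods divided by the mean period, and shimmer as the same ratio computed on per-cycle peak amplitudes. These are common textbook forms and may differ from what the paper uses. The function names, the 12-coefficient frame, and the sample values are hypothetical.

```python
import numpy as np


def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value.

    Assumed (common) definition; the paper's exact formula is not stated
    in the abstract.
    """
    values = np.asarray(values, dtype=float)
    if values.size < 2 or np.mean(values) == 0.0:
        return 0.0  # not enough cycles, or no signal, to measure perturbation
    return float(np.mean(np.abs(np.diff(values))) / np.mean(values))


def local_jitter(periods):
    """Jitter: perturbation of consecutive pitch periods (seconds)."""
    return relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Shimmer: perturbation of consecutive per-cycle peak amplitudes."""
    return relative_perturbation(amplitudes)


def append_features(spectral_frame, periods, amplitudes):
    """Append jitter and shimmer to a baseline spectral vector (e.g. MFCCs)."""
    extra = [local_jitter(periods), local_shimmer(amplitudes)]
    return np.concatenate([np.asarray(spectral_frame, dtype=float), extra])


# Hypothetical values for one voiced segment.
periods = [0.010, 0.011, 0.010, 0.011]   # pitch periods in seconds
amplitudes = [1.0, 0.9, 1.0, 0.9]        # per-cycle peak amplitudes
baseline = np.zeros(12)                  # stand-in for 12 cepstral coefficients

features = append_features(baseline, periods, amplitudes)
```

Here `features` holds the 12 baseline values followed by the jitter and shimmer values, which is the kind of appended vector a classifier such as an HMM with GMM state distributions would then consume.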

Original language: English
Title of host publication: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Pages: 1081-1084
Number of pages: 4
State: Published - 2007
Event: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07 - Honolulu, HI, United States
Duration: Apr 15 2007 to Apr 20 2007

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 4
ISSN (Print): 1520-6149

Conference

Conference: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Country/Territory: United States
City: Honolulu, HI
Period: 4/15/07 to 4/20/07

Keywords

  • GFCC
  • HMM
  • Jitter
  • MFCC
  • Shimmer

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
