Capacity and complexity of HMM duration modeling techniques

Research output: Contribution to journal › Article › peer-review

61 Scopus citations

Abstract

The ability of a standard hidden Markov model (HMM) or expanded state HMM (ESHMM) to accurately model duration distributions of phonemes is compared with specific duration-focused approaches such as semi-Markov models or variable transition probabilities. It is demonstrated that either a three-state ESHMM or a standard HMM with an increased number of states is capable of closely matching both Gamma distributions and duration distributions of phonemes from the TIMIT corpus, as measured by Bhattacharyya distance to the true distributions. Standard HMMs are easily implemented with off-the-shelf tools, whereas duration models require substantial algorithmic development and have higher computational costs when implemented, suggesting that a simple adjustment to HMM topologies is perhaps a more efficient solution to the problem of duration than more complex approaches.
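The effect described in the abstract can be illustrated with a minimal sketch. It is not the paper's method or data. It assumes durations are counted in frames, that the "increased number of states" is a plain left-to-right chain whose states share one self-loop probability (so the duration follows a negative binomial distribution), and that the target is a Gamma density sampled at integer durations and renormalized. The function names and the parameter values (`shape=3`, `scale=4`, `max_d=200`) are hypothetical choices for illustration only.

```python
import math


def chain_duration_pmf(n_states, p_stay, max_d):
    """Duration pmf of n_states left-to-right states sharing self-loop p_stay.

    P(d) = C(d-1, n-1) * p^(d-n) * (1-p)^n for d >= n (negative binomial).
    With n_states == 1 this reduces to the geometric pmf of a single state.
    """
    pmf = []
    for d in range(1, max_d + 1):
        if d < n_states:
            pmf.append(0.0)  # cannot leave the chain in fewer frames than states
        else:
            pmf.append(
                math.comb(d - 1, n_states - 1)
                * p_stay ** (d - n_states)
                * (1.0 - p_stay) ** n_states
            )
    return pmf


def discretized_gamma_pmf(shape, scale, max_d):
    """Gamma density evaluated at d = 1..max_d and renormalized to sum to 1."""
    dens = [d ** (shape - 1) * math.exp(-d / scale) for d in range(1, max_d + 1)]
    total = sum(dens)
    return [x / total for x in dens]


def bhattacharyya_distance(p, q):
    """-ln of the Bhattacharyya coefficient between two discrete pmfs."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(coefficient)


# Illustrative target (assumed values, not taken from the paper).
shape, scale, max_d = 3.0, 4.0, 200
target = discretized_gamma_pmf(shape, scale, max_d)
mean_duration = shape * scale  # mean of the continuous Gamma, in frames

# Match the mean duration: a chain of n states has mean n / (1 - p_stay).
distances = {}
for n_states in (1, 3):
    p_stay = 1.0 - n_states / mean_duration
    model = chain_duration_pmf(n_states, p_stay, max_d)
    distances[n_states] = bhattacharyya_distance(model, target)
    print(f"{n_states} state(s): Bhattacharyya distance = {distances[n_states]:.4f}")
```

Under these assumptions the single-state model is restricted to a geometric duration distribution, while the three-state chain can place its mode away from the shortest duration, which is why its distance to the Gamma target comes out smaller. Truncating at `max_d` drops a small amount of tail mass from the model pmfs, which is negligible for the values chosen here.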

Original language: English
Pages (from-to): 407-410
Number of pages: 4
Journal: IEEE Signal Processing Letters
Volume: 12
Issue number: 5
State: Published - May 2005

Keywords

  • Duration models
  • Hidden Markov models
  • Speech recognition

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Applied Mathematics
