Efficient embedded speech recognition for very large vocabulary Mandarin car-navigation systems

Yanmin Qian, Jia Liu, Michael T. Johnson

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Automatic speech recognition (ASR) for a very large vocabulary of isolated words is a difficult task on a resource-limited embedded device. This paper presents a novel fast decoding algorithm for a Mandarin speech recognition system that can simultaneously process hundreds of thousands of items while maintaining high recognition accuracy. The proposed algorithm constructs a semi-tree search network based on Mandarin pronunciation rules to avoid duplicate syllable matching and redundant memory use. Starting from a two-stage fixed-width beam-search baseline system, the algorithm employs a variable beam-width pruning strategy and a frame-synchronous word-level pruning strategy to significantly reduce recognition time. The algorithm is aimed at an in-car navigation system in China and was simulated on a standard PC workstation. The experimental results show that, compared to the baseline system, the proposed method reduces recognition time nearly 6-fold and memory size nearly 2-fold, with less than 1% accuracy degradation on a 200,000-word recognition task.
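As a rough illustration of two ideas named in the abstract, the sketch below shares common syllable prefixes across lexicon entries and prunes hypotheses with a beam whose width depends on the frame index. It is not the paper's algorithm: it uses a plain prefix tree rather than the semi-tree network, and the class `Node`, the functions `build_prefix_tree`, `count_nodes`, `beam_prune`, and `beam_width_for_frame`, the toy pinyin lexicon, and all numeric thresholds are assumptions made up for this example.

```python
from typing import Dict, List, Tuple


class Node:
    """One syllable position in a prefix tree over the lexicon (hypothetical)."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.words: List[str] = []  # words whose syllable sequence ends here


def build_prefix_tree(lexicon: Dict[str, List[str]]) -> Node:
    """Insert each word's syllable sequence; shared prefixes reuse nodes."""
    root = Node()
    for word, syllables in lexicon.items():
        node = root
        for syl in syllables:
            node = node.children.setdefault(syl, Node())
        node.words.append(word)
    return root


def count_nodes(node: Node) -> int:
    """Number of syllable nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def beam_prune(
    hyps: List[Tuple[str, float]], beam_width: float
) -> List[Tuple[str, float]]:
    """Keep hypotheses whose log-score is within `beam_width` of the best."""
    if not hyps:
        return []
    best = max(score for _, score in hyps)
    return [(h, s) for h, s in hyps if s >= best - beam_width]


def beam_width_for_frame(
    frame: int, wide: float = 200.0, narrow: float = 80.0, switch_frame: int = 30
) -> float:
    """Assumed schedule: a wide beam for early frames, a narrow one afterwards."""
    return wide if frame < switch_frame else narrow


# Toy lexicon (assumed): three place names written as pinyin syllables.
LEXICON = {
    "beijing_lu": ["bei", "jing", "lu"],
    "beijing_zhan": ["bei", "jing", "zhan"],
    "beihai": ["bei", "hai"],
}
TREE = build_prefix_tree(LEXICON)
```

In this toy lexicon the three entries contain eight syllables in total, but the tree holds only five syllable nodes, because `bei` and `bei jing` are stored once. A decoder would match each shared syllable once per frame instead of once per word, which is the kind of saving the abstract attributes to its search network.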

Original language: English
Pages (from-to): 1496-1500
Number of pages: 5
Journal: IEEE Transactions on Consumer Electronics
Volume: 55
Issue number: 3
State: Published - 2009

Bibliographical note

Funding Information:
This work was supported by the National High Technology Research and Development Program of China (No. 2006AA010101 and No. 2007AA04Z223) and by the National Natural Science Foundation of China and Microsoft Research Asia (No. 60776800). Yanmin Qian and Jia Liu are with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: [email protected]). Michael T. Johnson is with the Department of Electrical Engineering, Marquette University, Milwaukee, Wisconsin 53201, USA. He is now a visiting professor with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: [email protected]). Contributed Paper. Manuscript received June 11, 2009. 0098 3063/09/$20.00 © 2009 IEEE

Keywords

  • Beam-search
  • Search network
  • Speech recognition
  • Word-level pruning

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering
