Abstract
Large vocabulary continuous speech recognition can benefit from an efficient data structure for representing a large number of acoustic hypotheses compactly. Word graphs or lattices have been chosen as such an efficient interface between acoustic recognition engines and subsequent language processing modules. This paper first investigates the effect of pruning during acoustic decoding on the quality of word lattices and shows that by combining different pruning options (at the model level and word level), we can obtain word lattices with comparable accuracy to the original lattices and a manageable size. In order to use the word lattices as the input for a post-processing language module, they should preserve the target hypotheses and their scores while being as small as possible. In this paper, we introduce a word graph compression algorithm that significantly reduces the number of words in the graphical representation without eliminating utterance hypotheses or distorting their acoustic scores. We compare this word graph compression algorithm with several other lattice size-reducing approaches and demonstrate the relative strength of the new word graph compression algorithm for decreasing the number of words in the representation. Experiments are conducted across corpora and vocabulary sizes to determine the consistency of the pruning and compression results.
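The abstract does not spell out the compression algorithm, so the following is only a minimal sketch of the general idea it describes: shrinking a word graph while keeping every utterance hypothesis and its score. It assumes a toy lattice stored as `(src, dst, word, score)` edges with integer node ids, and it uses a generic technique (merging nodes whose outgoing edges are identical), not the paper's method. The names `hypotheses` and `merge_equivalent_nodes` are hypothetical, and timing information on nodes is ignored.

```python
from collections import defaultdict

def hypotheses(edges, start, end):
    """Return the set of (word sequence, total score) pairs over all start-to-end paths."""
    out = defaultdict(list)
    for src, dst, word, score in edges:
        out[src].append((dst, word, score))
    results = set()
    stack = [(start, (), 0.0)]
    while stack:
        node, words, total = stack.pop()
        if node == end:
            results.add((words, round(total, 6)))
            continue
        for dst, word, score in out[node]:
            stack.append((dst, words + (word,), total + score))
    return results

def merge_equivalent_nodes(edges):
    """Repeatedly merge nodes that have identical sets of outgoing edges.

    Illustrative assumption, not the paper's algorithm. Two such nodes are
    followed by exactly the same scored continuations, so redirecting all
    incoming edges to one of them leaves the hypothesis set unchanged.
    The smallest node id in each group is kept.
    """
    edges = set(edges)
    while True:
        outgoing = defaultdict(set)
        for src, dst, word, score in edges:
            outgoing[src].add((dst, word, score))
        groups = defaultdict(list)
        for node, signature in outgoing.items():
            groups[frozenset(signature)].append(node)
        rename = {}
        for nodes in groups.values():
            keep = min(nodes)
            for node in nodes:
                if node != keep:
                    rename[node] = keep
        if not rename:
            return edges
        edges = {(rename.get(s, s), rename.get(d, d), w, sc)
                 for s, d, w, sc in edges}

# Toy lattice: "the cat sat" and "a cat sat", with made-up log scores.
lattice = [
    (0, 1, "the", -1.0), (0, 2, "a", -1.5),
    (1, 3, "cat", -2.0), (2, 4, "cat", -2.0),
    (3, 5, "sat", -0.5), (4, 5, "sat", -0.5),
]
compressed = merge_equivalent_nodes(lattice)
```

On this toy input the word (edge) count drops from six to four, while `hypotheses(compressed, 0, 5)` equals `hypotheses(lattice, 0, 5)`, which is the property the abstract asks of a compressed word graph.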
| Original language | English |
|---|---|
| Pages (from-to) | 329-356 |
| Number of pages | 28 |
| Journal | Computer Speech and Language |
| Volume | 17 |
| Issue number | 4 |
| State | Published - Oct 2003 |
Bibliographical note
Funding Information: This research was supported by Intel, Purdue Research Foundation, and National Science Foundation under Grant No. IRI 97-04358, CDA 96-17388, and BCS-9980054.
Funding
| Funders | Funder number |
|---|---|
| National Science Foundation | IRI 97-04358, CDA 96-17388, BCS-9980054 |
| Intel Corporation | |
| Purdue University Research Foundation | |
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Human-Computer Interaction
Fingerprint
Dive into the research topics of 'The effect of pruning and compression on graphical representations of the output of a speech recognizer'. Together they form a unique fingerprint.