Abstract
Long short-term memory (LSTM) networks have been widely used in automatic speech recognition (ASR). This paper proposes a novel dynamic temporal residual learning mechanism for LSTM networks to better exploit temporal dependencies in sequential data. The mechanism is implemented by applying shortcut connections with dynamic weights to temporally adjacent LSTM outputs. Two dynamic weight generation methods are proposed: a secondary network and a random weight generator. Experimental results on the Wall Street Journal (WSJ) speech recognition dataset show that the proposed methods outperform the baseline LSTM network.
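The abstract describes shortcut connections with dynamic weights applied to temporally adjacent LSTM outputs. The paper's exact formulation is not reproduced here; the following is a minimal illustrative sketch under assumed simplifications: a scalar per-step weight, a shortcut only to the immediately preceding output (y_t = h_t + α_t · h_{t−1}), and a hypothetical one-layer sigmoid "secondary network" for the first weight-generation variant. All names and shapes below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def secondary_net_weight(h_prev, h_curr, W, b):
    """Hypothetical secondary network: one sigmoid layer mapping the
    concatenated adjacent hidden states to a scalar shortcut weight."""
    z = np.concatenate([h_prev, h_curr]) @ W + b
    return 1.0 / (1.0 + np.exp(-z))  # scalar in (0, 1)

def dynamic_temporal_residual(h, weight_fn):
    """Apply dynamically weighted shortcut connections between temporally
    adjacent outputs: y_0 = h_0, and y_t = h_t + alpha_t * h_{t-1}."""
    y = np.empty_like(h)
    y[0] = h[0]
    for t in range(1, h.shape[0]):
        alpha = weight_fn(h[t - 1], h[t])
        y[t] = h[t] + alpha * h[t - 1]
    return y

# Toy stand-in for LSTM outputs: 4 time steps, hidden size 3.
h = rng.standard_normal((4, 3))

# Variant 1: weights from the (assumed) secondary network.
W = rng.standard_normal(6) * 0.1
b = 0.0
y_net = dynamic_temporal_residual(
    h, lambda hp, hc: secondary_net_weight(hp, hc, W, b))

# Variant 2: weights from a random generator.
y_rand = dynamic_temporal_residual(h, lambda hp, hc: rng.uniform(0.0, 1.0))
```

In both variants the residual path only reweights the previous step's output, so the sequence length and hidden dimension are unchanged; in the secondary-network variant the weight depends on the data, while the random variant draws it independently of the hidden states.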
Original language | English |
---|---|
Title of host publication | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings |
Pages | 7709-7713 |
Number of pages | 5 |
ISBN (Electronic) | 9781509066315 |
DOIs | |
State | Published - May 2020 |
Event | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain. Duration: May 4, 2020 → May 8, 2020 |
Publication series
Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
---|---|
Volume | 2020-May |
ISSN (Print) | 1520-6149 |
Conference
Conference | 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 |
---|---|
Country/Territory | Spain |
City | Barcelona |
Period | 5/4/20 → 5/8/20 |
Bibliographical note
Funding Information: This research is supported by the National Key R&D Program of China (No. 2019QY1804) and the National Natural Science Foundation of China (No. U1836219 and No. U1636124).
Publisher Copyright:
© 2020 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
Keywords
- ASR
- Dynamic residual learning
- LSTM
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering