Abstract
The long short-term memory (LSTM) network with a gating mechanism has been widely used in sequence modeling tasks, including handwriting and speech recognition. Because an LSTM network can be unfolded along the temporal dimension, with a temporal depth equal to the length of the input feature sequence, gating alone might not be sufficient to fully model the dynamic temporal dependencies in sequential data. Inspired by residual learning in ResNet, this paper proposes a dynamic temporal residual network (DTRN) that incorporates residual learning into an LSTM network along the temporal dimension. DTRN involves two networks: its primary network consists of modified LSTM units with weighted shortcut connections for adjacent temporal outputs, while its secondary network generates dynamic weights for the shortcut connections. To validate the performance of DTRN, we conduct experiments on three commonly used public handwriting recognition datasets (IFN/ENIT, IAM and Rimes) and one speech recognition dataset (TIMIT). The experimental results show that the proposed DTRN outperforms previously reported methods.
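The two-network idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact equations, so everything below is an assumption made for illustration: the update `h_t = h_lstm + w_t * h_prev`, the single sigmoid layer standing in for the secondary network, and all function and parameter names (`lstm_step`, `shortcut_weights`, `forward`, `init_params`) are hypothetical and may differ from the paper's actual design.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(M, v, b):
    """Return M @ v + b for a list-of-rows matrix M."""
    return [sum(m * x for m, x in zip(row, v)) + b_i for row, b_i in zip(M, b)]


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One standard LSTM step. W maps [x_t; h_prev] to stacked pre-activations
    ordered as input gate, forget gate, output gate, candidate."""
    n = len(h_prev)
    z = affine(W, x_t + h_prev, b)  # list concatenation of input and previous output
    i = [sigmoid(v) for v in z[0:n]]
    f = [sigmoid(v) for v in z[n:2 * n]]
    o = [sigmoid(v) for v in z[2 * n:3 * n]]
    g = [math.tanh(v) for v in z[3 * n:4 * n]]
    c_t = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h_t = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c_t)]
    return h_t, c_t


def shortcut_weights(x_t, h_prev, V, v_b):
    """Assumed secondary network: one sigmoid layer producing a per-unit
    weight in (0, 1) that changes at every time step."""
    return [sigmoid(v) for v in affine(V, x_t + h_prev, v_b)]


def forward(xs, W, b, V, v_b, hidden_size):
    """Run the sequence, adding a weighted shortcut from the previous output."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    outputs = []
    for x_t in xs:
        h_lstm, c = lstm_step(x_t, h, c, W, b)      # primary network
        w_t = shortcut_weights(x_t, h, V, v_b)      # secondary network
        # Assumed temporal residual update: LSTM output plus weighted previous output.
        h = [a + w * p for a, w, p in zip(h_lstm, w_t, h)]
        outputs.append(h)
    return outputs


def init_params(input_size, hidden_size, seed=0):
    """Small random weights for illustration only (not a training recipe)."""
    rng = random.Random(seed)
    cols = input_size + hidden_size

    def mat(rows):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return mat(4 * hidden_size), [0.0] * (4 * hidden_size), mat(hidden_size), [0.0] * hidden_size
```

In this sketch, `forward(xs, *init_params(3, 4), 4)` returns one hidden vector per time step for a sequence of 3-dimensional inputs. Because the shortcut weight is recomputed from the current input and previous output, the contribution of the previous step varies over time rather than being a fixed identity connection.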
| Original language | English |
|---|---|
| Pages (from-to) | 235-246 |
| Number of pages | 12 |
| Journal | International Journal on Document Analysis and Recognition |
| Volume | 22 |
| Issue number | 3 |
| DOIs | |
| State | Published - Sep 1 2019 |
Bibliographical note
Publisher Copyright: © 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
Funding
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. This research is supported by the National Natural Science Foundation of China under Grants U1636124, 61471214 and 61573028. This research is also supported by the China Scholarship Council.
| Funders | Funder number |
|---|---|
| National Natural Science Foundation of China (NSFC) | 61573028, U1636124, 61471214 |
| China Scholarship Council | |
Keywords
- Long short-term memory
- Off-line handwriting recognition
- Residual learning
- Speech recognition
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition
- Computer Science Applications