Abstract
Recurrent Neural Networks (RNNs) are designed to handle sequential data but suffer from vanishing or exploding gradients. Unitary Recurrent Neural Networks (uRNNs) have recently been used to address this issue and, in some cases, exceed the capabilities of Long Short-Term Memory networks (LSTMs). We propose a simpler and novel update scheme that maintains orthogonal recurrent weight matrices without using complex-valued matrices. This is done by parametrizing the recurrent weight matrix with a skew-symmetric matrix through the Cayley transform. Such a parametrization cannot represent matrices that have negative one as an eigenvalue, but this limitation is overcome by scaling the recurrent weight matrix by a diagonal matrix consisting of ones and negative ones. The proposed training scheme involves a straightforward gradient calculation and update step. In several experiments, the proposed scaled Cayley orthogonal recurrent neural network (scoRNN) achieves superior results with fewer trainable parameters than other unitary RNNs.
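The parametrization described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not state the formula, so the convention used here, W = (I + A)^{-1}(I - A)D with A skew-symmetric and D diagonal with entries ±1, is an assumption, as are the function name `scaled_cayley` and the variables `n` and `rho` (the number of negative ones on the diagonal). It is not the authors' implementation and omits the training step entirely.

```python
import numpy as np


def scaled_cayley(A, D):
    """Return W = (I + A)^{-1} (I - A) D (assumed convention).

    A is expected to be skew-symmetric (A.T == -A), which guarantees
    that I + A is invertible. D is a diagonal matrix of +1/-1 entries.
    """
    n = A.shape[0]
    I = np.eye(n)
    # Solve (I + A) X = (I - A) instead of forming the inverse explicitly.
    return np.linalg.solve(I + A, I - A) @ D


rng = np.random.default_rng(0)
n, rho = 4, 2  # hypothetical size and number of -1 entries in D

# Build a skew-symmetric matrix from an arbitrary square matrix.
M = rng.standard_normal((n, n))
A = M - M.T

# Diagonal scaling matrix: rho negative ones followed by ones.
D = np.diag([-1.0] * rho + [1.0] * (n - rho))

W = scaled_cayley(A, D)

# W should be orthogonal up to floating-point error.
orth_error = np.linalg.norm(W.T @ W - np.eye(n))
print("orthogonality error:", orth_error)
```

With `A` set to the zero matrix the function returns `D` itself, an orthogonal matrix that has negative one as an eigenvalue whenever `rho > 0`. That is the kind of matrix the unscaled Cayley transform cannot produce.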
| Original language | English |
|---|---|
| Title of host publication | 35th International Conference on Machine Learning, ICML 2018 |
| Editors | Jennifer Dy, Andreas Krause |
| Pages | 3133-3143 |
| Number of pages | 11 |
| ISBN (Electronic) | 9781510867963 |
| Status | Published - 2018 |
| Event | 35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden. Duration: Jul 10 2018 → Jul 15 2018 |
Publication series
| Name | 35th International Conference on Machine Learning, ICML 2018 |
|---|---|
| Volume | 5 |
Conference
| Conference | 35th International Conference on Machine Learning, ICML 2018 |
|---|---|
| Country/Territory | Sweden |
| City | Stockholm |
| Period | 7/10/18 → 7/15/18 |
Bibliographical note
Publisher Copyright: © 2018 by authors. All rights reserved.
Funding
This research was supported in part by NSF Grants DMS-1317424 and DMS-1620082.
| Funders | Funder number |
|---|---|
| National Science Foundation (NSF) | DMS-1317424, DMS-1620082 |
ASJC Scopus subject areas
- Computational Theory and Mathematics
- Human-Computer Interaction
- Software
Fingerprint
Dive into the research topics of 'Orthogonal recurrent neural networks with scaled Cayley transform'. Together they form a unique fingerprint.