
Orthogonal recurrent neural networks with scaled Cayley transform

  • Kyle E. Helfrich
  • Devin Willmott
  • Qiang Ye

Research output: Conference contribution, peer-reviewed

22 Citations (Scopus)

Abstract

Recurrent Neural Networks (RNNs) are designed to handle sequential data but suffer from vanishing or exploding gradients. Recent work on Unitary Recurrent Neural Networks (uRNNs) has addressed this issue and, in some cases, exceeded the capabilities of Long Short-Term Memory networks (LSTMs). We propose a simpler and novel update scheme that maintains orthogonal recurrent weight matrices without using complex-valued matrices. This is done by parametrizing with a skew-symmetric matrix using the Cayley transform; such a parametrization cannot represent matrices that have negative one as an eigenvalue, but this limitation is overcome by scaling the recurrent weight matrix by a diagonal matrix consisting of ones and negative ones. The proposed training scheme involves a straightforward gradient calculation and update step. In several experiments, the proposed scaled Cayley orthogonal recurrent neural network (scoRNN) achieves superior results with fewer trainable parameters than other unitary RNNs.
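
The parametrization described in the abstract can be illustrated with a minimal sketch. It assumes the convention W = (I + A)^{-1} (I - A) D, where A is skew-symmetric and D is diagonal with entries +1 or -1; the abstract itself does not spell out the formula, so this form, the helper names (`solve`, `scaled_cayley`), and the example values are assumptions for illustration only. The code is written in dependency-free Python and is not the authors' implementation.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        # Pick the row with the largest pivot candidate for numerical stability.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def scaled_cayley(A, d):
    """Return W = (I + A)^{-1} (I - A) D for skew-symmetric A and D = diag(d).

    Assumed convention; each d[j] is expected to be +1 or -1.
    """
    n = len(A)
    eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    i_plus_a = [[eye[i][j] + A[i][j] for j in range(n)] for i in range(n)]
    i_minus_a = [[eye[i][j] - A[i][j] for j in range(n)] for i in range(n)]
    cayley = solve(i_plus_a, i_minus_a)  # orthogonal, never has eigenvalue -1
    # Right-multiplying by D flips the sign of column j when d[j] == -1.
    return [[cayley[i][j] * d[j] for j in range(n)] for i in range(n)]


# Hypothetical example values: a 3x3 skew-symmetric A (A^T = -A) and a sign vector d.
A = [[0.0, 0.5, -0.2],
     [-0.5, 0.0, 0.3],
     [0.2, -0.3, 0.0]]
d = [1.0, 1.0, -1.0]

W = scaled_cayley(A, d)
WtW = matmul(transpose(W), W)  # should be close to the identity

# With A = 0 and d = (-1, -1), the result is -I, whose eigenvalues are all -1:
# a matrix the unscaled Cayley transform cannot produce.
W_neg = scaled_cayley([[0.0, 0.0], [0.0, 0.0]], [-1.0, -1.0])
```

Because I + A is always invertible for a real skew-symmetric A, the solve step does not fail for valid inputs. In practice one would likely replace the hand-written `solve` with a library routine such as NumPy's `numpy.linalg.solve`; the plain-Python version is used here only to keep the sketch self-contained.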

Original language: English
Title of host publication: 35th International Conference on Machine Learning, ICML 2018
Editors: Jennifer Dy, Andreas Krause
Pages: 3133-3143
Number of pages: 11
ISBN (electronic): 9781510867963
Status: Published - 2018
Event: 35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden
Duration: Jul 10 2018 to Jul 15 2018

Publication series

Name: 35th International Conference on Machine Learning, ICML 2018
Volume: 5

Conference

Conference: 35th International Conference on Machine Learning, ICML 2018
Country/Territory: Sweden
City: Stockholm
Period: 7/10/18 to 7/15/18

Bibliographical note

Publisher Copyright:
© 2018 by authors. All rights reserved.

Funding

This research was supported in part by NSF Grants DMS-1317424 and DMS-1620082.

Funder: National Science Foundation (NSF)
Funder number: DMS-1317424, DMS-1620082

    ASJC Scopus subject areas

    • Computational Theory and Mathematics
    • Human-Computer Interaction
    • Software
