Eigenvalue normalized recurrent neural networks for short term memory

Kyle Helfrich, Qiang Ye

Research output: Conference contribution, peer-reviewed

4 Citations (Scopus)

Abstract

Several variants of recurrent neural networks (RNNs) with orthogonal or unitary recurrent matrices have recently been developed to mitigate the vanishing/exploding gradient problem and to model long-term dependencies of sequences. However, with the eigenvalues of the recurrent matrix on the unit circle, the recurrent state retains all input information, which may unnecessarily consume model capacity. In this paper, we address this issue by proposing an architecture that expands upon an orthogonal/unitary RNN with a state that is generated by a recurrent matrix with eigenvalues in the unit disc. Any input to this state dissipates in time and is replaced with new inputs, simulating short-term memory. A gradient descent algorithm is derived for learning such a recurrent matrix. The resulting method, called the Eigenvalue Normalized RNN (ENRNN), is shown to be highly competitive in several experiments.
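The contrast the abstract draws between eigenvalues on the unit circle and eigenvalues inside the unit disc can be illustrated with a small numerical sketch. This is not the authors' ENRNN algorithm. It assumes a purely linear 2x2 recurrence with no nonlinearity and no new inputs, and the helper names (`normalize_eigenvalues`, `state_norms`) and the target radius of 0.9 are hypothetical choices made only for illustration.

```python
import cmath
import math


def eigenvalues_2x2(w):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = w
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


def spectral_radius(w):
    """Largest eigenvalue modulus of a 2x2 matrix."""
    return max(abs(lam) for lam in eigenvalues_2x2(w))


def normalize_eigenvalues(w, target_radius):
    """Rescale w so its spectral radius equals target_radius.

    Illustrative assumption: a plain division by the spectral radius,
    not the learning procedure derived in the paper.
    """
    rho = spectral_radius(w)
    if rho == 0:
        raise ValueError("cannot normalize a matrix with spectral radius 0")
    scale = target_radius / rho
    return [[scale * x for x in row] for row in w]


def matvec(w, h):
    """Matrix-vector product for nested-list matrices."""
    return [sum(wij * hj for wij, hj in zip(row, h)) for row in w]


def state_norms(w, h0, steps):
    """Norm of the state h_t = w h_{t-1} at each step, with no new input."""
    h = list(h0)
    norms = [math.hypot(*h)]
    for _ in range(steps):
        h = matvec(w, h)
        norms.append(math.hypot(*h))
    return norms


theta = 0.3
# A rotation is orthogonal: both eigenvalues lie on the unit circle.
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
# Shrinking the spectral radius to 0.9 moves the eigenvalues inside the unit disc.
damped = normalize_eigenvalues(rotation, target_radius=0.9)

orthogonal_norms = state_norms(rotation, [1.0, 0.0], steps=50)
damped_norms = state_norms(damped, [1.0, 0.0], steps=50)

print("orthogonal, step 50:", orthogonal_norms[-1])
print("damped, step 50:    ", damped_norms[-1])
```

Under these assumptions the orthogonal recurrence preserves the norm of the initial state indefinitely, while the damped recurrence shrinks it by a factor of 0.9 per step, which is the dissipation behaviour the abstract associates with short-term memory.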

Original language: English
Title of host publication: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
Pages: 4115-4122
Number of pages: 8
ISBN (electronic): 9781577358350
DOI
Status: Published - 2020
Event: 34th AAAI Conference on Artificial Intelligence, AAAI 2020 - New York, United States
Duration: Feb 7 2020 to Feb 12 2020

Publication series

Name: AAAI 2020 - 34th AAAI Conference on Artificial Intelligence

Conference

Conference: 34th AAAI Conference on Artificial Intelligence, AAAI 2020
Country/Territory: United States
City: New York
Period: 2/7/20 to 2/12/20

Bibliographical note

Publisher Copyright:
© 2020, Association for the Advancement of Artificial Intelligence.

Funding

Acknowledgments. This research was supported in part by NSF under grants DMS-1821144 and DMS-1620082.

Funders: U.S. Department of Energy; Chinese Academy of Sciences; Guangzhou Municipal Science and Technology Project; Oak Ridge National Laboratory; Extreme Science and Engineering Discovery Environment; National Science Foundation; National Energy Research Scientific Computing Center; National Natural Science Foundation of China
Funder numbers: 1821144, DMS-1821144, DMS-1620082

    ASJC Scopus subject areas

    • Artificial Intelligence
