Abstract
The convolutional neural network is an important model in deep learning, and a convolution operation in such a network can be represented by a kernel tensor. To avoid the exploding/vanishing gradient problem and to improve the generalizability of a neural network, it is desirable for a convolution operation to nearly preserve the norm, that is, for the singular values of the transformation matrix corresponding to the tensor to be bounded around 1. We propose a penalty function that constrains the singular values of the transformation matrix to be around 1. We derive an algorithm that carries out the gradient descent minimization of this penalty function in terms of the convolution kernel tensors. Numerical examples are presented to demonstrate the effectiveness of the method.
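As a rough illustration of the idea described in the abstract, the sketch below (in PyTorch, which the paper does not necessarily use) computes the singular values of the transformation matrix of a circular 2-D convolution from the 2-D DFT of its kernel and runs gradient descent on a simple quadratic penalty pushing those singular values toward 1. The specific penalty sum_i (sigma_i^2 - 1)^2, the optimizer, and the tensor shapes are illustrative assumptions, not the exact penalty function or algorithm proposed in the paper.

```python
import torch

def conv_singular_values(kernel, n):
    """All singular values of the transformation matrix of a circular
    2-D convolution with the given kernel, acting on n x n inputs.

    kernel: (c_out, c_in, k, k). The 2-D DFT block-diagonalizes a
    circular convolution, so the singular values of the full
    transformation matrix are exactly the singular values of the
    per-frequency (c_out x c_in) blocks of the kernel's zero-padded
    2-D DFT.
    """
    transfer = torch.fft.fft2(kernel, s=(n, n), dim=(-2, -1))
    blocks = transfer.permute(2, 3, 0, 1)            # (n, n, c_out, c_in)
    return torch.linalg.svdvals(blocks).reshape(-1)  # n*n*min(c_out, c_in) values

def penalty(kernel, n):
    # Illustrative penalty (an assumption, not the paper's exact choice):
    # every singular value is pushed toward 1.
    sv = conv_singular_values(kernel, n)
    return ((sv ** 2 - 1.0) ** 2).sum()

# Gradient descent on the penalty alone, with respect to the kernel entries.
kernel = (0.1 * torch.randn(8, 4, 3, 3)).requires_grad_()
opt = torch.optim.Adam([kernel], lr=1e-2)
print("initial penalty:", float(penalty(kernel, 16)))
for _ in range(300):
    opt.zero_grad()
    loss = penalty(kernel, 16)
    loss.backward()
    opt.step()
print("final penalty:  ", float(penalty(kernel, 16)))
```

In practice such a penalty would be added, with a weight, to the training loss of the network, so that the convolutional layers are regularized toward norm-preserving maps.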
| Original language | English |
| --- | --- |
| Pages (from-to) | 1-13 |
| Number of pages | 13 |
| Journal | Linear and Multilinear Algebra |
| DOIs | |
| State | Published - 2020 |
Bibliographical note
Funding Information: Research supported by the Fundamental Research Funds for the Central Universities [Grant No. 2652019320] and the China Scholarship Council. Part of this work was done while one of the authors was a visiting scholar at the Department of Mathematics, University of Kentucky, from July 2018 to July 2019. Research also supported in part by NSF [Grant Nos. DMS-1821144 and DMS-1620082]. The authors are grateful to Professor Xinguo Liu of Ocean University of China and Professor Beatrice Meini of the University of Pisa for their valuable suggestions.
Publisher Copyright:
© 2020, Informa UK Limited, trading as Taylor & Francis Group.
Keywords
- Penalty function
- convolutional layers
- generalizability
- transformation matrix
- unstable gradient
ASJC Scopus subject areas
- Algebra and Number Theory