Abstract
Convolutional neural networks are an important class of models in deep learning, and a convolution operation can be represented by a tensor. To avoid exploding/vanishing gradient problems and to improve the generalizability of a neural network, it is desirable for a convolution operation to nearly preserve the norm, that is, for the singular values of the transformation matrix corresponding to the kernel tensor to be bounded around 1. We propose a penalty function that constrains the singular values of the transformation matrix to lie near 1, and we derive an algorithm to carry out the gradient descent minimization of this penalty function directly in terms of the convolution kernel tensors. Numerical examples are presented to demonstrate the effectiveness of the method.
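The abstract does not reproduce the penalty or the algorithm in closed form. As a rough illustration of the idea, the sketch below assumes circular (periodic) padding, under which the 2D DFT block-diagonalizes the convolution's transformation matrix, so its singular values are those of one small c_out × c_in matrix per spatial frequency. The quadratic penalty Σ(σ − 1)² and the function names `conv_singular_values` and `penalty` are assumptions for the example, not taken from the paper, and automatic differentiation stands in for the paper's derived gradient.

```python
import torch

def conv_singular_values(kernel, n):
    """Singular values of the linear map defined by a circularly padded
    2D convolution on n x n inputs.

    kernel: tensor of shape (c_out, c_in, h, w).
    With circular padding, the 2D DFT block-diagonalizes the convolution,
    so the singular values of the full transformation matrix are the union
    of the singular values of one (c_out, c_in) matrix per frequency pair.
    """
    transforms = torch.fft.fft2(kernel, s=(n, n))   # (c_out, c_in, n, n)
    transforms = transforms.permute(2, 3, 0, 1)     # (n, n, c_out, c_in)
    return torch.linalg.svdvals(transforms)         # (n, n, min(c_out, c_in))

def penalty(kernel, n):
    """Quadratic penalty pushing every singular value toward 1
    (an assumed form; the paper's exact penalty may differ)."""
    sigma = conv_singular_values(kernel, n)
    return ((sigma - 1.0) ** 2).sum()

# Gradient descent on the kernel tensor itself, with autograd
# supplying the gradient of the penalty.
kernel = torch.randn(8, 4, 3, 3, requires_grad=True)
opt = torch.optim.SGD([kernel], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    penalty(kernel, n=16).backward()
    opt.step()
```

In practice such a term would be added to a training loss with a weight, e.g. `loss = task_loss + lam * penalty(kernel, n)`; per the abstract, the paper itself derives the gradient descent update of its penalty in terms of the convolution kernel tensors rather than relying on automatic differentiation.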
| Original language | English |
| --- | --- |
| Pages (from-to) | 1-13 |
| Number of pages | 13 |
| Journal | Linear and Multilinear Algebra |
| DOIs | |
| State | Published - 2020 |
Bibliographical note
Publisher Copyright: © 2020 Informa UK Limited, trading as Taylor & Francis Group.
Keywords
- Penalty function
- Convolutional layers
- Generalizability
- Transformation matrix
- Unstable gradient
ASJC Scopus subject areas
- Algebra and Number Theory