Abstract
In this paper, we study the generalization performance of overparameterized 3-layer NTK models. We show that, for a specific set of ground-truth functions (which we refer to as the “learnable set”), the test error of the overfitted 3-layer NTK is upper bounded by an expression that decreases with the number of neurons in the two hidden layers. Unlike the 2-layer NTK, which has only one hidden layer, the 3-layer NTK involves interactions between two hidden layers. Our upper bound reveals that the test error descends faster with respect to the number of neurons in the second hidden layer (the one closer to the output) than with respect to the number in the first hidden layer (the one closer to the input). We also show that the learnable set of the 3-layer NTK without bias is no smaller than that of 2-layer NTK models with various choices of bias in the neurons. However, in terms of actual generalization performance, our results suggest that the 3-layer NTK is much less sensitive to the choice of bias than the 2-layer NTK, especially when the input dimension is large.
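To make the setting in the abstract concrete, the following is a minimal illustrative sketch of an "overfitted" NTK-style model, not the paper's construction. It assumes (these are assumptions, not stated in the abstract) a bias-free 3-layer ReLU network whose first-layer weights `W1` and output weights `a` are fixed at random initialization, a linearization with respect to the second-hidden-layer weights `W2` only, and "overfitted" meaning the minimum-l2-norm solution that interpolates the training data. All names, sizes, and the toy target function are hypothetical.

```python
import numpy as np

def ntk_features(X, W1, W2, a):
    """Gradient features of f(x) = sum_k a_k * relu(w2_k . relu(W1 x))
    with respect to the second-hidden-layer weights W2 (assumed setup)."""
    H1 = np.maximum(X @ W1.T, 0.0)        # (n, p1): first hidden layer outputs
    act = (H1 @ W2.T > 0).astype(float)   # (n, p2): second-layer activation pattern
    # Feature block for neuron k is a_k * 1{w2_k . h1 > 0} * h1.
    feats = (act * a)[:, :, None] * H1[:, None, :]   # (n, p2, p1)
    return feats.reshape(X.shape[0], -1)             # (n, p2 * p1)

def fit_min_norm(H, y):
    """Minimum-l2-norm weight change that fits the training labels."""
    return np.linalg.pinv(H) @ y

rng = np.random.default_rng(0)
d, p1, p2, n = 5, 40, 60, 20              # hypothetical sizes, with p1 * p2 >> n

W1 = rng.standard_normal((p1, d))         # fixed first-layer weights
W2 = rng.standard_normal((p2, p1))        # second-layer weights at initialization
a = rng.choice([-1.0, 1.0], size=p2)      # fixed output weights

X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # inputs on the unit sphere
y = np.sin(3.0 * X[:, 0])                       # toy ground-truth function

H = ntk_features(X, W1, W2, a)
delta = fit_min_norm(H, y)
train_residual = np.max(np.abs(H @ delta - y))  # near zero when H has full row rank

# Prediction of the linearized model on fresh inputs.
X_test = rng.standard_normal((200, d))
X_test /= np.linalg.norm(X_test, axis=1, keepdims=True)
y_pred = ntk_features(X_test, W1, W2, a) @ delta
test_mse = np.mean((y_pred - np.sin(3.0 * X_test[:, 0])) ** 2)
```

Under these assumptions, rerunning the sketch with different `p1` and `p2` is one informal way to see how test error varies with the width of each hidden layer; it is not a reproduction of the paper's bound or experiments.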
| Original language | English |
|---|---|
| Title of host publication | Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022 |
| Editors | S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh |
| ISBN (Electronic) | 9781713871088 |
| State | Published - 2022 |
| Event | 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States; Duration: Nov 28 2022 → Dec 9 2022 |
Publication series
| Name | Advances in Neural Information Processing Systems |
|---|---|
| Volume | 35 |
| ISSN (Print) | 1049-5258 |
Conference
| Conference | 36th Conference on Neural Information Processing Systems, NeurIPS 2022 |
|---|---|
| Country/Territory | United States |
| City | New Orleans |
| Period | 11/28/22 → 12/9/22 |
Bibliographical note
Publisher Copyright: © 2022 Neural Information Processing Systems Foundation. All rights reserved.
Funding
This work has been supported in part by NSF grants 2112471 (also partly funded by DHS), CNS-2106933, CNS-1901057, and CNS-2113893; by the Bilsland Dissertation Fellowship at Purdue University; and by a grant from the Army Research Office, W911NF-21-1-0244.
| Funders | Funder number |
|---|---|
| National Science Foundation Arctic Social Science Program | 2112471 |
| U.S. Department of Homeland Security | CNS-2113893, CNS-1901057, CNS-2106933 |
| Army Research Office | W911NF-21-1-0244 |
| Purdue Climate Change Research Center, Purdue University | |
ASJC Scopus subject areas
- Signal Processing
- Information Systems
- Computer Networks and Communications