On the Generalization Power of the Overfitted Three-Layer Neural Tangent Kernel Model

Peizhong Ju, Xiaojun Lin, Ness B. Shroff

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

In this paper, we study the generalization performance of overparameterized 3-layer NTK models. We show that, for a specific set of ground-truth functions (which we refer to as the "learnable set"), the test error of the overfitted 3-layer NTK is upper bounded by an expression that decreases with the number of neurons in the two hidden layers. Unlike the 2-layer NTK, which has only one hidden layer, the 3-layer NTK involves interactions between two hidden layers. Our upper bound reveals that, between the two hidden layers, the test error descends faster with respect to the number of neurons in the second hidden layer (the one closer to the output) than with respect to that in the first hidden layer (the one closer to the input). We also show that the learnable set of the 3-layer NTK without bias is no smaller than that of 2-layer NTK models with various choices of bias in the neurons. However, in terms of actual generalization performance, our results suggest that the 3-layer NTK is much less sensitive to the choice of bias than the 2-layer NTK, especially when the input dimension is large.
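To make the model class concrete, the following is a minimal sketch (not taken from the paper) of the finite-width empirical NTK of a 3-layer ReLU network f(x) = w3ᵀσ(W2 σ(W1 x)): each kernel entry is the inner product of the network's parameter gradients at two inputs. The network shape, widths, and scaling below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def param_grads(x, W1, W2, w3):
    """Gradient of the scalar output f(x) = w3 @ relu(W2 @ relu(W1 @ x))
    with respect to (W1, W2, w3), flattened into one vector."""
    h1 = W1 @ x             # pre-activations of the first hidden layer
    a1 = np.maximum(h1, 0)  # ReLU
    h2 = W2 @ a1            # pre-activations of the second hidden layer
    a2 = np.maximum(h2, 0)
    g_w3 = a2                         # df/dw3
    d_h2 = w3 * (h2 > 0)              # backprop through the output layer
    g_W2 = np.outer(d_h2, a1)         # df/dW2
    d_h1 = (W2.T @ d_h2) * (h1 > 0)   # backprop through the second layer
    g_W1 = np.outer(d_h1, x)          # df/dW1
    return np.concatenate([g_W1.ravel(), g_W2.ravel(), g_w3.ravel()])

def empirical_ntk(X, W1, W2, w3):
    """Empirical NTK Gram matrix: K[i, j] = <grad f(x_i), grad f(x_j)>."""
    G = np.stack([param_grads(x, W1, W2, w3) for x in X])
    return G @ G.T

rng = np.random.default_rng(0)
d, m1, m2, n = 5, 64, 64, 8          # input dim, two hidden widths, sample count
W1 = rng.normal(size=(m1, d)) / np.sqrt(d)
W2 = rng.normal(size=(m2, m1)) / np.sqrt(m1)
w3 = rng.normal(size=m2) / np.sqrt(m2)
X = rng.normal(size=(n, d))
K = empirical_ntk(X, W1, W2, w3)     # symmetric PSD Gram matrix, shape (n, n)
```

The NTK regime studied in the paper corresponds to training only a linearization of such a network around its random initialization, with m1 and m2 (the two hidden widths that appear in the upper bound) taken large.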

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
ISBN (Electronic): 9781713871088
State: Published - 2022
Event: 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States
Duration: Nov 28, 2022 - Dec 9, 2022

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35
ISSN (Print): 1049-5258

Conference

Conference: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country/Territory: United States
City: New Orleans
Period: 11/28/22 - 12/9/22

Bibliographical note

Publisher Copyright:
© 2022 Neural information processing systems foundation. All rights reserved.

Funding

This work has been supported in part by NSF grants: 2112471 (also partly funded by DHS), CNS-2106933, CNS-1901057, CNS-2113893, and the Bilsland Dissertation Fellowship at Purdue University, and a grant from the Army Research Office: W911NF-21-1-0244.

Funders: Funder number
National Science Foundation: 2112471 (also partly funded by the U.S. Department of Homeland Security), CNS-2106933, CNS-1901057, CNS-2113893
Army Research Office: W911NF-21-1-0244
Purdue University: Bilsland Dissertation Fellowship

ASJC Scopus subject areas

• Signal Processing
• Information Systems
• Computer Networks and Communications
