Understanding the Generalization Power of Overfitted NTK Models: 3-layer vs. 2-layer (Extended Abstract)

Peizhong Ju, Xiaojun Lin, Ness B. Shroff

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

Neural tangent kernel (NTK) models [1] have recently been used as an important intermediate step to understand the exceptional generalization power of overparameterized deep neural networks (DNNs). Compared to linear models with simple Gaussian or Fourier features, NTK models can capture the nonlinear features inherent in neural networks. Indeed, the work in [2] has shown that, for a 2-layer NTK model, the generalization error of an overfitted solution decreases as the number of neurons increases. Further, this descent behavior is qualitatively different from that of linear models with simple Gaussian and Fourier features, and closer to that of an actual neural network.
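
As a rough illustration of what an "overfitted 2-layer NTK model" can look like, the sketch below builds ReLU NTK features from the gradient of a 2-layer network with respect to its bottom-layer weights, then takes the minimum l2-norm solution that interpolates the training data. The feature form, the unit-sphere inputs, the ground-truth function, and all names (`ntk_features`, `min_norm_fit`, `demo`) are assumptions made for illustration and are not taken from the abstract or from [2].

```python
import numpy as np

def ntk_features(X, V0):
    """Assumed 2-layer ReLU NTK features: for each neuron j, 1{x . V0[j] > 0} * x."""
    # X: (n, d) inputs; V0: (p, d) initial bottom-layer weights, kept fixed.
    active = (X @ V0.T > 0).astype(X.dtype)  # (n, p) activation pattern
    # Row i concatenates active[i, j] * X[i] over all neurons j -> (n, p * d).
    return (active[:, :, None] * X[:, None, :]).reshape(X.shape[0], -1)

def min_norm_fit(H, y):
    """Minimum l2-norm solution of H w = y (interpolates if H has full row rank)."""
    return np.linalg.pinv(H) @ y

def demo(n=20, d=5, p=200, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # place inputs on the unit sphere
    y = X[:, 0] ** 2                               # hypothetical ground-truth function
    V0 = rng.standard_normal((p, d))               # random initial weights (assumption)
    H = ntk_features(X, V0)
    w = min_norm_fit(H, y)
    return H, y, w
```

Re-running `demo` with larger `p` (the number of neurons) and measuring the error of `ntk_features(X_test, V0) @ w` on fresh samples is one way to observe how the overfitted solution behaves as the model widens.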

Original language: English
Title of host publication: 2022 58th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2022
ISBN (Electronic): 9798350399981
State: Published - 2022
Event: 58th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2022 - Monticello, United States
Duration: Sep 27 2022 to Sep 30 2022

Publication series

Name: 2022 58th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2022

Conference

Conference: 58th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2022
Country/Territory: United States
City: Monticello
Period: 9/27/22 to 9/30/22

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Control and Optimization
