On the generalization power of overfitted two-layer neural tangent kernel models

Peizhong Ju, Xiaojun Lin, Ness B. Shroff

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

In this chapter, we study the generalization performance of minimum-norm overfitting solutions for the neural tangent kernel (NTK) model of a two-layer neural network with ReLU activation and no bias term. We show that, depending on the ground-truth function, the test error of overfitted NTK models exhibits characteristics different from the "double descent" behavior of other overparameterized linear models with simple Fourier or Gaussian features. Specifically, for a class of learnable functions, we derive a new upper bound on the generalization error that approaches a small limiting value even as the number of neurons p approaches infinity. This limiting value further decreases with the number of training samples n. For functions outside this class, we provide a lower bound on the generalization error that does not diminish to zero even when n and p are both large.
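To make the setup concrete, the following is a minimal sketch (not taken from the chapter) of a minimum-norm overfitting solution for an NTK model, assuming the standard NTK feature map of a bias-free two-layer ReLU network with respect to its bottom-layer weights; the sizes, random data, and placeholder ground-truth function below are illustrative assumptions, not the chapter's experimental setup.

# Hedged sketch: minimum-norm overfitting solution for a two-layer ReLU NTK model
# with no bias, training only the bottom-layer weights (top-layer signs omitted).
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 50, 5, 2000                                # samples, input dimension, neurons (illustrative)

X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)        # place inputs on the unit sphere
y = np.sin(X[:, 0])                                  # placeholder ground-truth function

W0 = rng.standard_normal((p, d))                     # random initial bottom-layer weights

def ntk_features(X, W0):
    """NTK (gradient) features w.r.t. bottom-layer weights: one d-dimensional block
    per neuron, equal to x when the neuron is active at initialization, else 0."""
    act = (X @ W0.T > 0).astype(float)               # (n, p) ReLU activation pattern
    blocks = act[:, :, None] * X[:, None, :]         # (n, p, d) per-neuron gradient blocks
    return blocks.reshape(X.shape[0], -1) / np.sqrt(W0.shape[0])

Phi = ntk_features(X, W0)                            # n x (p*d); overparameterized since p*d >> n
delta_w = np.linalg.pinv(Phi) @ y                    # minimum-l2-norm interpolating solution
print("training error:", np.linalg.norm(Phi @ delta_w - y))   # ~0: the model overfits

Because p*d far exceeds n, the pseudoinverse returns the interpolating solution of minimum l2 norm; the chapter analyzes how the test error of such overfitted solutions behaves as p and n grow, depending on the ground-truth function.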

Original language: English
Title of host publication: Artificial Intelligence for Edge Computing
Pages: 111-135
Number of pages: 25
ISBN (Electronic): 9783031407871
DOIs
State: Published - Dec 21 2023

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. All rights reserved.

ASJC Scopus subject areas

  • General Computer Science
  • General Engineering
