CR-SAM: Curvature Regularized Sharpness-Aware Minimization

Tao Wu, Tie Luo, Donald C. Wunsch

Research output: Contribution to journalConference articlepeer-review

5 Scopus citations

Abstract

The capacity to generalize to future unseen data stands as one of the utmost crucial attributes of deep neural networks. Sharpness-Aware Minimization (SAM) aims to enhance the generalizability by minimizing worst-case loss using one-step gradient ascent as an approximation. However, as training progresses, the non-linearity of the loss landscape increases, rendering one-step gradient ascent less effective. On the other hand, multi-step gradient ascent will incur higher training cost. In this paper, we introduce a normalized Hessian trace to accurately measure the curvature of loss landscape on both training and test sets. In particular, to counter excessive non-linearity of loss landscape, we propose Curvature Regularized SAM (CR-SAM), integrating the normalized Hessian trace as a SAM regularizer. Additionally, we present an efficient way to compute the trace via finite differences with parallelism. Our theoretical analysis based on PAC-Bayes bounds establishes the regularizer’s efficacy in reducing generalization error. Empirical evaluation on CIFAR and ImageNet datasets shows that CR-SAM consistently enhances classification performance for ResNet and Vision Transformer (ViT) models across various datasets. Our code is available at https://github.com/TrustAIoT/CR-SAM.

Original languageEnglish
Pages (from-to)6144-6152
Number of pages9
JournalProceedings of the AAAI Conference on Artificial Intelligence
Volume38
Issue number6
DOIs
StatePublished - Mar 25 2024
Event38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: Feb 20 2024Feb 27 2024

Bibliographical note

Publisher Copyright:
Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Funding

This work was supported in part by the National Science Foundation (NSF) under Grant No. 2008878, and in part by the Air Force Research Laboratory (AFRL) and the Lifelong Learning Machines program by DARPA/MTO under Contract No. FA8650-18-C-7831. The research was also sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-22-2-0209.

FundersFunder number
Air Force Research Laboratory
Defense Advanced Research Projects Agency
U.S. Department of Energy Chinese Academy of Sciences Guangzhou Municipal Science and Technology Project Oak Ridge National Laboratory Extreme Science and Engineering Discovery Environment National Science Foundation National Energy Research Scientific Computing Center National Natural Science Foundation of China2008878
U.S. Department of Energy Chinese Academy of Sciences Guangzhou Municipal Science and Technology Project Oak Ridge National Laboratory Extreme Science and Engineering Discovery Environment National Science Foundation National Energy Research Scientific Computing Center National Natural Science Foundation of China
Microsystems Technology OfficeFA8650-18-C-7831
Microsystems Technology Office
Army Research LaboratoryW911NF-22-2-0209
Army Research Laboratory

    ASJC Scopus subject areas

    • Artificial Intelligence

    Fingerprint

    Dive into the research topics of 'CR-SAM: Curvature Regularized Sharpness-Aware Minimization'. Together they form a unique fingerprint.

    Cite this