Improved techniques for training adaptive deep networks

Hao Li, Hong Zhang, Xiaojuan Qi, Yang Ruigang, Gao Huang

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

96 Scopus citations


Adaptive inference is a promising technique to improve the computational efficiency of deep models at test time. In contrast to static models which use the same computation graph for all instances, adaptive networks can dynamically adjust their structure conditioned on each input. While existing research on adaptive inference mainly focuses on designing more advanced architectures, this paper investigates how to train such networks more effectively. Specifically, we consider a typical adaptive deep network with multiple intermediate classifiers. We present three techniques to improve its training efficacy from two aspects: 1) a Gradient Equilibrium algorithm to resolve the conflict of learning of different classifiers; 2) an Inline Subnetwork Collaboration approach and a One-for-all Knowledge Distillation algorithm to enhance the collaboration among classifiers. On multiple datasets (CIFAR-10, CIFAR-100 and ImageNet), we show that the proposed approach consistently leads to further improved efficiency on top of state-of-the-art adaptive deep networks.

Original languageEnglish
Title of host publicationProceedings - 2019 International Conference on Computer Vision, ICCV 2019
Number of pages10
ISBN (Electronic)9781728148038
StatePublished - Oct 2019
Event17th IEEE/CVF International Conference on Computer Vision, ICCV 2019 - Seoul, Korea, Republic of
Duration: Oct 27 2019Nov 2 2019

Publication series

NameProceedings of the IEEE International Conference on Computer Vision
ISSN (Print)1550-5499


Conference17th IEEE/CVF International Conference on Computer Vision, ICCV 2019
Country/TerritoryKorea, Republic of

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition


Dive into the research topics of 'Improved techniques for training adaptive deep networks'. Together they form a unique fingerprint.

Cite this