Improving the Accuracy-Latency Trade-off of Edge-Cloud Computation Offloading for Deep Learning Services

Xiaobo Zhao, Minoo Hosseinzadeh, Nathaniel Hudson, Hana Khamfroush, Daniel E. Lucani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations


Offloading tasks to the Edge or the Cloud has the potential to improve the accuracy of classification and detection tasks, as more powerful hardware and machine learning models can be used. The downside is the added delay of sending the data to the Edge/Cloud. In delay-sensitive applications, it is usually necessary to strike a balance between accuracy and latency. However, the state of the art typically considers all-or-nothing offloading decisions, e.g., process locally or send all available data to the Edge (Cloud). Our goal is to expand the options in the accuracy-latency trade-off by allowing the source to send a fraction of the total data for processing. We evaluate the performance of image classifiers when faced with images whose quality has been purposely reduced in order to reduce traffic costs. Using three common models (SqueezeNet, GoogleNet, ResNet) and two data sets (Caltech101, ImageNet), we show that the Gompertz function provides a good approximation of the accuracy of a model given the fraction of the image data that is actually conveyed to the model. We formulate the offloading decision process using this new flexibility and show that a better overall accuracy-latency trade-off is attained: 58% traffic reduction, 25% latency reduction, as well as 12% accuracy improvement.
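The role of the Gompertz function in the abstract can be illustrated with a minimal sketch. The three-parameter form a·exp(−b·exp(−c·x)) is the standard Gompertz curve; whether the paper uses exactly this parameterization is an assumption, and the parameter values A, B, C below are hypothetical placeholders, not fitted results from the paper. The helper `min_fraction_for_accuracy` is likewise an illustrative name: it inverts the curve to find the smallest data fraction that would reach a target accuracy under the model.

```python
import math


def gompertz(x, a, b, c):
    """Standard Gompertz curve: a * exp(-b * exp(-c * x)).

    x is the fraction of image data sent (0..1); a is the asymptotic
    accuracy; b and c (both > 0) control displacement and growth rate.
    """
    return a * math.exp(-b * math.exp(-c * x))


def min_fraction_for_accuracy(target, a, b, c):
    """Smallest fraction x in [0, 1] with gompertz(x) >= target.

    Returns None when the target cannot be reached within [0, 1].
    Assumes b > 0 and c > 0, so the curve is increasing in x.
    """
    if target >= a:
        return None  # the curve only approaches a, it never reaches it
    if target <= gompertz(0.0, a, b, c):
        return 0.0  # already satisfied without sending any data
    # Closed-form inverse of the Gompertz curve.
    x = -math.log(-math.log(target / a) / b) / c
    return x if x <= 1.0 else None


# Hypothetical parameters for illustration only (not from the paper).
A, B, C = 0.90, 5.0, 8.0

if __name__ == "__main__":
    for frac in (0.2, 0.4, 0.6, 0.8, 1.0):
        print(f"fraction={frac:.1f} -> modeled accuracy={gompertz(frac, A, B, C):.3f}")
    print("fraction needed for 0.80 accuracy:",
          min_fraction_for_accuracy(0.80, A, B, C))
```

With measured (fraction, accuracy) pairs, the three parameters could be estimated by nonlinear least squares, for example with `scipy.optimize.curve_fit`; the closed-form inverse then gives a direct handle for an offloading policy that trades traffic against accuracy.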

Original language: English
Title of host publication: 2020 IEEE Globecom Workshops, GC Wkshps 2020 - Proceedings
ISBN (Electronic): 9781728173078
State: Published - Dec 2020
Event: 2020 IEEE Globecom Workshops, GC Wkshps 2020 - Virtual, Taipei, Taiwan, Province of China
Duration: Dec 7 2020 to Dec 11 2020

Publication series

Name: 2020 IEEE Globecom Workshops, GC Wkshps 2020 - Proceedings


Conference: 2020 IEEE Globecom Workshops, GC Wkshps 2020
Country/Territory: Taiwan, Province of China
City: Virtual, Taipei

Bibliographical note

Funding Information:
V. CONCLUSIONS

Using a thorough evaluation on multiple image classification models and data sets, we show that judicious reductions of image size (and quality) result in smooth reductions in expected accuracy. More specifically, we show that (a) the transitions can be modeled well using a Gompertz function with few parameters to fit for the given classification model and expected workloads; and (b) reductions of image size of 58% are possible with a minor reduction in accuracy. This paper also shows that these features can be used to make task offloading decisions, particularly when both latency and accuracy are crucial in the system. Our numerical results show that offloading decisions that send all data to the Edge or the Cloud are suboptimal in general from this perspective. The optimal solution results in reductions of traffic of up to 58% while also obtaining a 25% overall latency reduction and, in some scenarios, even an increase in the attained accuracy given the ability to select and manage multiple classification models. Future work will focus on multi-user, multi-edge and multi-cloud scenarios using a similar intuition to our current analysis. We will also study classification models that are able to exploit the additional compression introduced in our system to reduce the processing cost of the model itself and, thus, reduce overall latency.

ACKNOWLEDGMENT

This work was partially financed by Cisco Systems Inc. under the research grant No. 1215519250, by the Aarhus Universitets Forskningsfond (AUFF) Starting Grant Project AUFF-2017-FLS-7-1, and by Aarhus University's DIGIT Centre.

Publisher Copyright:
© 2020 IEEE.

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Software

