QoS-Aware Placement of Deep Learning Services on the Edge with Multiple Service Implementations

Nathaniel Hudson, Hana Khamfroush, Daniel E. Lucani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

21 Scopus citations

Abstract

Mobile edge computing pushes computationally intensive services closer to the user, reducing delay through physical proximity. This has led many to consider deploying deep learning models on the edge, commonly known as edge intelligence (EI). EI services can have many model implementations that provide different QoS. For instance, one model can perform inference faster than another (thus reducing latency) while achieving lower accuracy when evaluated. In this paper, we study joint service placement and model scheduling of EI services with the goal of maximizing Quality-of-Service (QoS) for end users, where EI services have multiple implementations to serve user requests, each with varying costs and QoS benefits. We cast the problem as an integer linear program and prove that it is NP-hard. We then prove that the objective is equivalent to maximizing a monotone increasing, submodular set function and thus can be solved greedily while maintaining a (1-1/e)-approximation guarantee. We then propose two greedy algorithms: one that theoretically guarantees this approximation and another that empirically matches its performance with greater efficiency. Finally, we thoroughly evaluate the proposed algorithms for making placement and scheduling decisions in both synthetic and real-world scenarios against the optimal solution and several baselines. In the real-world case, we consider real machine learning models using the ImageNet 2012 dataset for requests. Our numerical experiments empirically show that our more efficient greedy algorithm approximates the optimal solution with a 0.904 approximation ratio on average, while the next closest baseline achieves a 0.607 approximation ratio on average.
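To illustrate the greedy idea mentioned in the abstract, the following is a minimal Python sketch of greedy maximization of a monotone submodular set function. It is not the paper's algorithm: the abstract does not state the constraints or the exact objective, so this sketch assumes a simple cardinality budget (at most `budget` placements) and a facility-location style objective in which each user receives the best QoS among the chosen (edge node, model) placements. The names `total_qos`, `greedy_select`, `EXAMPLE_QOS`, and all data values are hypothetical. Under a cardinality constraint, the classical result for this greedy rule gives a (1-1/e) approximation for monotone submodular objectives.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical placement identifier: (edge_node, model_implementation).
Placement = Tuple[str, str]

# Hypothetical data: QoS each placement would deliver to each user it can serve.
EXAMPLE_QOS: Dict[Placement, Dict[str, float]] = {
    ("edge1", "fast"): {"u1": 0.6, "u2": 0.6},
    ("edge1", "accurate"): {"u1": 0.9},
    ("edge2", "fast"): {"u3": 0.5},
    ("edge2", "accurate"): {"u2": 0.8, "u3": 0.8},
}


def total_qos(chosen: Set[Placement], qos: Dict[Placement, Dict[str, float]]) -> float:
    """Assumed objective: each user gets the best QoS among chosen placements.

    This facility-location form is monotone and submodular.
    """
    users = {u for served in qos.values() for u in served}
    return sum(
        max((qos[p].get(u, 0.0) for p in chosen), default=0.0) for u in users
    )


def greedy_select(
    qos: Dict[Placement, Dict[str, float]], budget: int
) -> Tuple[Set[Placement], float]:
    """Repeatedly add the placement with the largest marginal QoS gain."""
    chosen: Set[Placement] = set()
    current = 0.0
    for _ in range(budget):
        best: Optional[Placement] = None
        best_gain = 0.0
        for p in sorted(qos):  # sorted for deterministic tie-breaking
            if p in chosen:
                continue
            gain = total_qos(chosen | {p}, qos) - current
            if gain > best_gain + 1e-12:
                best, best_gain = p, gain
        if best is None:  # no remaining placement improves the objective
            break
        chosen.add(best)
        current += best_gain
    return chosen, current


if __name__ == "__main__":
    placements, value = greedy_select(EXAMPLE_QOS, budget=2)
    print(sorted(placements), round(value, 3))
```

On this toy instance the greedy choice happens to coincide with the best pair of placements; in general the greedy rule only guarantees the approximation bound, not optimality.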

Original language: English
Title of host publication: 30th International Conference on Computer Communications and Networks, ICCCN 2021
ISBN (Electronic): 9780738113302
State: Published - Jul 2021
Event: 30th International Conference on Computer Communications and Networks, ICCCN 2021 - Virtual, Athens, Greece
Duration: Jul 19 2021 → Jul 22 2021

Publication series

Name: Proceedings - International Conference on Computer Communications and Networks, ICCCN
Volume: 2021-July
ISSN (Print): 1095-2055

Conference

Conference: 30th International Conference on Computer Communications and Networks, ICCCN 2021
Country/Territory: Greece
City: Virtual, Athens
Period: 7/19/21 → 7/22/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Keywords

  • Deep Learning
  • Edge Computing
  • Edge Intelligence
  • Optimization
  • Quality-of-Service
  • Service Placement

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Software
