Edge AI as a Service: Configurable Model Deployment and Delay-Energy Optimization With Result Quality Constraints

Wenyu Zhang, Sherali Zeadally, Wei Li, Haijun Zhang, Jingyi Hou, Victor C.M. Leung

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

Breakthroughs in artificial intelligence (AI) techniques have accelerated their application in a wide range of industries, such as security protection, transportation, agriculture, and medical care. With the support of edge computing environments, providing latency-guaranteed AI as a Service (AIaaS) can accelerate the deployment of data-intensive and computation-intensive AI applications and reduce customers' investment costs. However, the deployment architecture, working mechanism design, and performance optimization problems specific to AIaaS with configurable data quality and model complexity have not been studied in existing works. To address these problems, we propose a configurable model deployment architecture (CMDA) for edge AIaaS and present a flexible working mechanism that enables the joint configuration of data quality ratios (DQRs) and model complexity ratios (MCRs) for the AI tasks. Combined with commonly used resource allocation operations, this lets the manager improve the energy and delay performance of AI services while meeting the desired quality of results (QoRs). We formulate an energy-delay minimization problem under the CMDA framework and propose a polynomial-regression-based relaxing method to solve the task configuration subproblem. We conduct experiments and simulations on the ImageNet classification and the common objects in context (COCO) object detection tasks using state-of-the-art deep learning models. We present the corresponding result quality tables (RQTs) and QoR regression models to illustrate the proposed method. The results for single-task configuration and for multi-task configuration with resource allocation on both tasks demonstrate that the proposed method can achieve over 5× HDEC improvement compared with non-optimization schemes, and that joint configuration of DQR and MCR can achieve over 1.2× HDEC improvement compared with methods that configure only DQR or only MCR.
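As a rough illustration of what joint DQR/MCR configuration under a QoR constraint means, the sketch below picks the cheapest (DQR, MCR) pair from a small result quality table. Everything in it is an assumption made for illustration: the table entries are made-up placeholders rather than values from the paper, the cost function is a toy stand-in for a weighted delay-energy objective, and the exhaustive grid search replaces the polynomial-regression-based relaxing method the abstract describes.

```python
# Hypothetical result quality table (RQT): (DQR, MCR) -> QoR.
# All numbers are invented placeholders, NOT results from the paper.
RQT = {
    (0.5, 0.5): 0.55, (0.5, 0.75): 0.62, (0.5, 1.0): 0.66,
    (0.75, 0.5): 0.63, (0.75, 0.75): 0.70, (0.75, 1.0): 0.74,
    (1.0, 0.5): 0.66, (1.0, 0.75): 0.73, (1.0, 1.0): 0.78,
}


def cost(dqr, mcr, w_delay=0.5, w_energy=0.5):
    """Toy weighted delay-energy cost (an assumed model, not the paper's).

    Delay has a transmission part that grows with data quality and a
    compute part that grows with data quality times model complexity;
    energy is taken to grow with data quality times model complexity.
    """
    delay = 0.3 * dqr + 0.7 * dqr * mcr
    energy = dqr * mcr
    return w_delay * delay + w_energy * energy


def best_config(rqt, min_qor):
    """Return (cost, dqr, mcr) of the cheapest entry with QoR >= min_qor.

    Returns None when no table entry satisfies the QoR constraint.
    """
    feasible = [
        (cost(dqr, mcr), dqr, mcr)
        for (dqr, mcr), qor in rqt.items()
        if qor >= min_qor
    ]
    return min(feasible) if feasible else None


if __name__ == "__main__":
    print(best_config(RQT, min_qor=0.70))
```

With these placeholder values, lowering both ratios together can satisfy a QoR target at a lower cost than lowering only one of them, which is the intuition behind configuring DQR and MCR jointly.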

Original language: English
Pages (from-to): 1954-1969
Number of pages: 16
Journal: IEEE Transactions on Cloud Computing
Volume: 11
Issue number: 2
State: Published - Apr 1 2023

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

Keywords

  • AI as a Service
  • delay-energy optimization
  • edge computing
  • resource allocation
  • task configuration

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
  • Computer Networks and Communications
  • Computer Science Applications
