CAREER: Integrated and End-to-end Machine Learning Pipeline for Edge-enabled IoT Systems: A Resource-aware and QoS-aware Perspective

Grants and Contracts Details


Overview. The daily data generated by Internet-of-Things (IoT) devices is immense and holds valuable knowledge for training advanced deep learning models. These models power automated systems like self-driving cars, which rely on object detection to make driving decisions. However, extracting comprehensive knowledge from this vast data is challenging due to the computational costs of training deep learning models. Additionally, privacy concerns require deep learning techniques to extract knowledge without compromising user data privacy. To address these challenges, federated learning (FL) has gained popularity. FL decentralizes the training process by having IoT devices communicate only their locally trained model parameters to a central server. However, conventional 2-tier architectures for FL are inadequate given the scale of the IoT. The emergence of edge computing (EC), which brings computation closer to end users, presents an opportunity for FL advancement. Yet most existing FL works assume static data on IoT devices, accurate labeling, and sufficient local compute resources, which are unrealistic in real-world IoT settings. Moreover, IoT data often requires pre-processing before being fed into learning models. Because IoT devices have limited resources and face communication constraints, pre-processing should ideally be performed on the devices themselves rather than in the cloud. This necessitates an integrated approach that considers both data pre-processing and model training to optimize decisions regarding where and how to pre-process the data and how to orchestrate model training on the IoT systems. This work focuses on designing integrated, reliable, and efficient resource-aware federated data pre-processing and learning processes within a 3-tier IoT-Edge-Cloud platform. The platform comprises a central cloud, edge servers, and IoT devices, taking into account data and resource heterogeneity, which can change over time. The emphasis is on balancing performance and cost trade-offs in the proposed FL processes.

Intellectual Merit. This project aims to design efficient and reliable integrated processes for federated data pre-processing and federated learning in 3-tier IoT-Edge-Cloud platforms. It focuses on resource management in dynamic and heterogeneous IoT systems, where devices have varying data and system resources. Previous works often made simplifying assumptions or focused on only one aspect of the process. To support reliable and automated federated learning processes, these assumptions need to be abandoned. The research will develop an efficient and automated federated data analysis pipeline, considering the performance-cost trade-offs in dynamic IoT systems. Coding and compression techniques will be explored to reduce communication and computation costs. Additionally, reliable federated learning processes will be designed to handle the impact of failures and straggler nodes in the system. The project will apply stochastic optimization techniques and Markov Decision Process modeling to provide theoretical upper bounds for the problems. Model-free reinforcement learning algorithms, such as Proximal Policy Optimization, will be used to solve these models. Novel algorithms and heuristics will be developed to provide efficient and scalable approximations of optimal solutions for real-world IoT systems. Extensive evaluation will be performed using simulations and state-of-the-art deep learning and network modeling tools to quantify the benefits of the proposed solutions compared with baselines and state-of-the-art algorithms.

Broader Impact. The proposal contains integrated education, outreach, and dissemination plans that consist of various interconnected components, including teaching courses related to the proposed research, lunch-and-learns with underrepresented undergraduate students, hands-on activities and outreach programs for (underrepresented) K-12 students to attract more students to science, collaborative undergraduate summer research opportunities for underrepresented minorities, international collaboration, and the hiring of a female PhD student. As part of this proposal, key findings will be shared with the academic community through publications in international journals and conferences, online videos posted on YouTube, presentations in departmental seminars, and blog posts on the PI’s lab group webpage. The PI will also organize and chair two workshops at the INFOCOM and ICML conferences to help bring together different computer science communities and set a new agenda for the field. This work will take active steps to reach out to underrepresented communities in hopes of expanding the diversity of future technologists and scientists, including advancing the career of a junior female faculty member.

Keywords. Federated learning; Heterogeneous IoT Systems; Edge computing; Resource management
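
Illustrative sketch. The abstract does not specify an aggregation rule, so the following is only a minimal sketch of one aggregation round in a 3-tier IoT-Edge-Cloud setting, assuming sample-weighted parameter averaging (the FedAvg rule). The function names, the flat list-of-floats model representation, and the toy numbers are all hypothetical and are not taken from the project.

```python
from typing import List, Tuple

Params = List[float]            # a model flattened to a list of parameters
Update = Tuple[Params, int]     # (parameters, number of local samples)


def weighted_average(updates: List[Update]) -> Update:
    """Average parameter vectors, weighting each by its sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(p[i] * n for p, n in updates) / total for i in range(dim)]
    return avg, total


def hierarchical_round(edges: List[List[Update]]) -> Params:
    """One aggregation round: IoT devices -> edge servers -> cloud."""
    # Edge tier: each edge server averages the models of its own devices.
    edge_models = [weighted_average(device_updates) for device_updates in edges]
    # Cloud tier: average the edge models, weighted by samples seen per edge.
    global_model, _ = weighted_average(edge_models)
    return global_model


# Hypothetical toy data: two edge servers; only parameters and sample
# counts leave the devices, never the raw data.
edges = [
    [([1.0, 2.0], 10), ([3.0, 4.0], 30)],   # edge server A, two devices
    [([5.0, 6.0], 60)],                     # edge server B, one device
]
global_model = hierarchical_round(edges)
```

Because each edge model carries its total sample count forward, the two-stage average equals a single sample-weighted average over all devices. The sketch ignores stragglers, failures, and compression, which the project treats as open research questions.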
Effective start/end date: 3/1/24 to 2/28/29


  • National Science Foundation: $252,913.00

