Purdue University Graduate School

File(s) under embargo until file(s) become available

Reason: Journal Submission

Deep Reinforcement Learning of IoT System Dynamics for Optimal Orchestration and Boosted Efficiency

posted on 2023-08-30, 20:18 authored by Haowei Shi

This thesis targets the orchestration challenge of wearable Internet of Things (IoT) systems: finding optimal system configurations in terms of energy efficiency, computing, and data transmission activities. We first investigated reinforcement learning in simulated IoT environments to demonstrate its effectiveness, and then studied the algorithm on real-world wearable motion data to show its practical promise.

More specifically, the first challenge arises in complex massive-device orchestration, where it is essential to configure and manage both the massive devices and the gateway/server. On the wearable IoT device side, the complexity lies in the diverse energy budgets, computing efficiency, and so on. On the phone or server side, it lies in how the global diversity can be analyzed and how the system configuration can be optimized. We therefore propose a new reinforcement learning architecture, called boosted deep deterministic policy gradient, with enhanced actor-critic co-learning and multi-view state transformation. The proposed actor-critic co-learning allows for enhanced dynamics abstraction through a shared neural network component. Evaluated on a simulated massive-device task, the proposed deep reinforcement learning framework achieved much more efficient system configurations, with enhanced computing capabilities and improved energy efficiency.

Second, we leveraged real-world motion data to demonstrate the potential of reinforcement learning to optimally configure the motion sensors. We used sequential data estimation paradigms to obtain estimated data for some sensors, which saves energy because these sensors no longer need to be activated to collect data during the estimation intervals. We then introduced the Deep Deterministic Policy Gradient algorithm to learn to control the estimation timing. This study provides a real-world demonstration of maximizing the energy efficiency of wearable IoT applications while maintaining data accuracy. Overall, this thesis advances wearable IoT system orchestration for optimal system configurations.


Degree Type

  • Master of Science


Department

  • Electrical and Computer Engineering

Campus location

  • Indianapolis

Advisor/Supervisor/Committee Chair

Qingxue Zhang

Additional Committee Member 2

Brian King

Additional Committee Member 3

Shiaofen Fang
