Purdue University Graduate School
1-Sibendu_Paul_PhD_ECE_Final_Thesis.pdf (12.22 MB)


posted on 2024-05-17, 15:23 authored by Sibendu Paul

Recent advancements in deep learning (DL) and high-bandwidth access networks such as 5G enable applications that require intelligence and fast computation at the edge with low power consumption. In this thesis, we study how to improve the Quality-of-Experience (QoE) of these emerging 5G applications, e.g., virtual reality (VR) and video analytics on edge devices. These applications require either high-quality visual effects under a stringent latency requirement (for VR) or high analytics accuracy (for video analytics), while maintaining frame-rate requirements under dynamic conditions.

In part 1, we study how to support high-quality untethered immersive multiplayer VR on commodity mobile devices. Simply replicating the prior art for single-user VR results in a network bandwidth requirement that grows linearly with the number of players and exceeds the bandwidth of WiFi (802.11ac). We propose a novel technique, Coterie, that splits the rendering of background environment (BE) frames between the mobile device and the edge server. This split drastically enhances the similarity of the BE frames and reduces the network load via frame caching and reuse. Coterie reduces the per-player network requirement by over 10x and easily supports 4 players on Pixel 2 over 802.11ac while maintaining the QoE constraints of 4K VR.

In part 2, we study how to achieve high analytics accuracy in video analytics pipelines (VAPs). We observe that the frames captured by the surveillance video cameras powering a variety of 24x7 analytics applications are not always pristine: they can be distorted by environmental condition changes, lighting issues, sensor noise, compression, etc. Such distortions not only deteriorate the accuracy of deep learning applications but also negatively impact the utilization of the edge server resources used to run these computationally expensive DL models. First, we study how to dynamically filter out low-quality captured frames. We propose a lightweight DL-based quality estimator, AQuA, that can be used to filter out low-quality frames that can lead to high-confidence errors (false positives) if fed into the analytic units (AUs) in the VAP. AQuA-filter reduces false positives by 17% and compute and network usage by up to 27% when used in a face-recognition VAP. Second, we study how to reduce such poor-quality frame captures by the camera. We propose CamTuner, a system that automatically and dynamically adapts the complex camera settings to changing environmental conditions based on analytical quality estimation to enhance the accuracy of video analytics. In a real customer deployment, CamTuner enhances VAP accuracy by detecting 15.9% more persons and 2.6%–4.2% more cars (without any false positives) than the default camera setting. While CamTuner focuses on improving the accuracy of a single AU running on a camera stream, we next present Elixir, a system that enhances the video stream quality for multiple analytics on a video stream by jointly optimizing the objectives of the different AUs. In a real-world deployment, Elixir correctly detects 7.1% (22,068) and 5.0% (15,731) more cars, 94% (551) and 72% (478) more faces, and 670.4% (4975) and 158.6% (3507) more persons than the default-camera-setting and time-sharing approaches, respectively.


Degree Type

  • Doctor of Philosophy

Department


  • Electrical and Computer Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Y. Charlie Hu

Advisor/Supervisor/Committee co-chair

Edward J. Delp

Additional Committee Member 2

David I. Inouye

Additional Committee Member 3

Vijay Raghunathan
