Out of Distribution Representation Learning for Network System Forecasting

Gao, Jianfei

doi:10.25394/PGS.22589401.v1

Out_of_Distribution_Representation_Learning_for_Network_System_Forecasting.pdf (2.53 MB)

thesis

posted on 2023-04-12, 12:45 authored by Jianfei Gao

Representation learning algorithms, at the cutting edge of modern AI, have shown their ability to automatically solve complex tasks in diverse fields including computer vision, speech recognition, autonomous driving, and biology. Unsurprisingly, representation learning applications in computer networking domains, such as network management, video streaming, and traffic forecasting, have attracted increasing interest in recent years. However, the success of representation learning algorithms depends on consistency between the training and test data distributions, which cannot be guaranteed in some scenarios due to resource limitations, privacy, or other infrastructure reasons. When the distribution shifts between training and test data, representation learning algorithms have to apply tuned models in environments whose data distributions differ substantially from those seen during training. This issue is known as Out-Of-Distribution (OOD) generalization and is still an open topic in machine learning. In this dissertation, I present solutions for OOD cases found in cloud services, which will help improve user experience. First, I implement Infinity SGD, which can extrapolate from light-load server logs to predict server performance under heavy load. Infinity SGD bridges light-load and heavy-load server states by modeling the server under different loads with a unified Continuous Time Markov Chain (CTMC) that shares the same parameters. I show, on a real-world testbed and in synthetic experiments, that Infinity SGD can perform extrapolations that no prior work can. Next, I propose Veritas, a framework to answer what the user experience would be if a different adaptive bitrate (ABR) algorithm, a kind of video streaming data transfer algorithm, were used with the same server, client, and connection status. Veritas strictly follows a Structural Causal Model (SCM), which guarantees its ability to answer what-if counterfactual and interventional questions for video streaming. I showcase that Veritas can accurately account for confounders when answering what-if questions in real-world emulations, where no existing work can. Finally, I propose time-then-graph, a provably more expressive temporal graph neural network (TGNN) than prior works. I empirically show that time-then-graph is a more efficient and accurate framework for forecasting traffic on network data, which will serve as essential input data for Infinity SGD. In addition, in parallel with this dissertation, I formalize the Knowledge Graph (KG) as a doubly exchangeable attributed graph. Based on this formalization, I propose a doubly exchangeable representation blueprint that enables a complex logical reasoning task that no prior work addresses. This work may also find potential traffic classification applications in the networking field.
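
To make the idea of one CTMC with shared parameters across loads concrete, here is a minimal sketch in Python. It is not Infinity SGD and is not taken from the thesis: it assumes a textbook M/M/1/K queue, and the function names, rates, and buffer size are all hypothetical choices for illustration. The service rate and capacity are held fixed while only the arrival rate (the load) changes, so the same parameters describe both the light-load and the heavy-load chain.

```python
import numpy as np

def mm1k_generator(arrival_rate, service_rate, capacity):
    """Generator matrix Q of an M/M/1/K queue with states 0..capacity.

    Illustrative assumption: a textbook birth-death chain, not the
    model used in the thesis.
    """
    n = capacity + 1
    Q = np.zeros((n, n))
    for i in range(n):
        if i < capacity:
            Q[i, i + 1] = arrival_rate   # a request arrives
        if i > 0:
            Q[i, i - 1] = service_rate   # a request finishes service
        Q[i, i] = -Q[i].sum()            # rows of a generator sum to zero
    return Q

def stationary_distribution(Q):
    """Solve pi Q = 0 subject to sum(pi) = 1 by least squares."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])     # balance equations + normalization
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def loss_probability(arrival_rate, service_rate, capacity):
    """Long-run probability that the buffer is full (arrivals are dropped)."""
    Q = mm1k_generator(arrival_rate, service_rate, capacity)
    return stationary_distribution(Q)[-1]

# Shared (hypothetical) server parameters; only the load differs.
SERVICE_RATE, CAPACITY = 10.0, 5
light = loss_probability(2.0, SERVICE_RATE, CAPACITY)
heavy = loss_probability(9.0, SERVICE_RATE, CAPACITY)
print(f"loss under light load: {light:.6f}, under heavy load: {heavy:.6f}")
```

In this toy setting, parameters estimated from light-load observations determine the heavy-load behavior because both regimes share the same chain structure; the thesis itself should be consulted for how Infinity SGD learns such parameters from logs.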

Degree Type

Doctor of Philosophy

Department

Computer Science

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Bruno Felisberto Martins Ribeiro

Additional Committee Member 2

Sonia A. Fahmy

Additional Committee Member 3

Sanjay G. Rao

Additional Committee Member 4

Pan Li

Keywords

Computer Science, Machine Learning, Deep Learning, Representation Learning, Out of Distribution

Licence

CC BY 4.0
