Graph Learning at Scale: Algorithms, Systems, and Applications
Graph-structured data capture complex relationships and interactions between entities, offering valuable insights for scientific discovery, business modeling, and AI-driven decision-making. Despite its transformative potential, learning on graphs faces two key challenges: (1) scaling expressive learning approaches, especially subgraph-based graph representation learning, and (2) ensuring privacy when handling sensitive relational data. Both challenges arise from intricate dependencies in graph structures, which limit the effectiveness of canonical algorithms and system optimizations. This dissertation addresses these challenges through a unified framework that integrates system-aware algorithm design across two main thrusts.
In Thrust I, we develop a family of efficient frameworks for expressive graph representation learning that eliminate redundancy in subgraph-based methods. By decoupling dependencies over task-specific input features (i.e., query-induced subgraphs), the proposed paradigm enables efficient higher-order pattern discovery, scalable network analysis on billion-edge graphs, and low-latency online inference using reusable, task-agnostic features derived from random walks, node-set sampling, and neighborhood hashing. In Thrust II, we extend the design principle to privacy-preserving relational learning, where structural dependencies in graphs often violate the gradient decoupling assumption in standard privacy learning mechanisms like differentially private stochastic gradient descent (DP-SGD). We propose the first differential private relational learning framework that disentangles sample dependencies through a tailored DP-SGD approach. This framework enables the private fine-tuning of large language models (LLMs) on sensitive graph data, effectively addressing associated computational complexities while achieving strong privacy-utility trade-offs.
By co-designing learning algorithms and system implementations, this dissertation demonstrates how graph-based AI can be both scalable and trustworthy, opening new avenues for learning from complex structured data in real-world applications.
Funding
NSF PHY-2117997
NSF CCF-2402816
NSF IIS-2239565
History
Degree Type
- Doctor of Philosophy
Department
- Computer Science
Campus location
- West Lafayette