Graph Representation Learning for Unsupervised and Semi-supervised Learning Tasks

thesis
posted on 19.12.2021, 17:50 by Mengyue Hang
Graph representation learning and Graph Neural Network (GNN) models provide flexible tools for modeling and representing relational data (graphs) in various application domains. Specifically, node embedding methods provide continuous representations for vertices that have proved quite useful for prediction tasks, and GNNs have recently been used for semi-supervised node and graph classification tasks with great success.
However, most node embedding methods for unsupervised tasks assume a simple, sparse graph and are mostly optimized to encode aspects of the network structure (typically local connectivity) using random walks. GNNs, in turn, model dependencies among the attributes of neighboring nodes rather than dependencies among observed node labels, which makes them insufficiently expressive for semi-supervised node classification tasks.
This thesis will investigate methods to address these limitations, including:

(1) For heterogeneous graphs: Development of a method for dense(r), heterogeneous graphs that incorporates global statistics into the negative sampling procedure with applications in recommendation tasks;
(2) For capturing long-range role equivalence: Formalized notions of representation-based equivalence with respect to regular/automorphic equivalence in a single graph or multiple graph samples, which are employed in embedding-based models to capture long-range equivalence patterns that reflect topological roles;
(3) For collective classification: Because GNNs model dependencies among the attributes of neighboring nodes rather than dependencies among observed node labels, we develop an add-on collective learning framework for GNNs that provably boosts their expressiveness for node classification beyond that of an optimal WL-GNN. The framework uses self-supervised learning and Monte Carlo sampled embeddings to incorporate node labels during inductive learning for semi-supervised node classification tasks.
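
For readers unfamiliar with the "WL-GNN" baseline in item (3): it refers to GNNs whose ability to tell nodes and graphs apart is bounded by the 1-dimensional Weisfeiler-Lehman (1-WL) color refinement test. The following is a minimal sketch of standard 1-WL refinement, not of the thesis's framework. The function name `wl_colors`, the adjacency-dict input format, and the two toy graphs are assumptions chosen for illustration.

```python
from collections import Counter


def wl_colors(adj, rounds=3):
    """Return the histogram of 1-WL node colors after `rounds` refinements.

    adj: dict mapping each node to a list of its neighbors
         (a hypothetical input format chosen for this sketch).
    """
    colors = {v: 0 for v in adj}  # every node starts with the same color
    for _ in range(rounds):
        # A node's new color is its old color together with the sorted
        # multiset of its neighbors' old colors. The nested tuple itself is
        # kept as the color, so colors are comparable across graphs.
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())


# Two disjoint triangles versus a single 6-cycle: non-isomorphic graphs,
# but both are 2-regular on six nodes.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
six_cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

same = wl_colors(two_triangles) == wl_colors(six_cycle)
print("1-WL gives identical color histograms:", same)
```

Because every node in both toy graphs has exactly two neighbors, refinement never separates any colors, and the two graphs receive identical histograms even though they are not isomorphic. This is the kind of limitation that motivates looking for expressiveness beyond 1-WL.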

Degree Type

Doctor of Philosophy

Department

Computer Science

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Jennifer Neville

Additional Committee Member 2

Bruno Ribeiro

Additional Committee Member 3

Ming Yin

Additional Committee Member 4

Yexiang Xue
