Adaptive Stochastic Gradient Markov Chain Monte Carlo Methods for Dynamic Learning and Network Embedding
Latent variable models are widely used in modern data science for both statistic and dynamic data. This thesis focuses on large-scale latent variable models formulated for time series data and static network data. The former refers to the state space model for dynamic systems, which models the evolution of latent state variables and the relationship between the latent state variables and observations. The latter refers to a network decoder model, which map a large network into a low-dimensional space of latent embedding vectors. Both problems can be solved by adaptive stochastic gradient Markov chain Monte Carlo (MCMC), which allows us to simulate the latent variables and estimate the model parameters in a simultaneous manner and thus facilitates the down-stream statistical inference from the data.
For the state space model, its challenge is on inference for high-dimensional, large scale and long series data. The existing algorithms, such as particle filter or sequential importance sampler, do not scale well to the dimension of the system and the sample size of the dataset, and often suffers from the sample degeneracy issue for long series data. To address the issue, the thesis proposes the stochastic approximation Langevinized ensemble Kalman filter (SA-LEnKF) for jointly estimating the states and unknown parameters of the dynamic system, where the parameters are estimated on the fly based on the state variables simulated by the LEnKF under the framework of stochastic approximation MCMC. Under mild conditions, we prove its consistency in parameter estimation and ergodicity in state variable simulations. The proposed algorithm can be used in uncertainty quantification for long series, large scale, and high-dimensional dynamic systems. Numerical results on simulated datasets and large real-world datasets indicate its superiority over the existing algorithms, and its great potential in statistical analysis of complex dynamic systems encountered in modern data science.
For the network embedding problem, an appropriate embedding dimension is hard to determine under the theoretical framework of the existing methods, where the embedding dimension is often considered as a tunable hyperparameter or a choice of common practice. The thesis proposes a novel network embedding method with a built-in mechanism for embedding dimension selection. The basic idea is to treat the embedding vectors as the latent inputs for a deep neural network (DNN) model. Then by an adaptive stochastic gradient MCMC algorithm, we can simulate of the embedding vectors and estimate the parameters of the DNN model in a simultaneous manner. By the theory of sparse deep learning, the embedding dimension can be determined via imposing an appropriate sparsity penalty on the DNN model. Experiments on real-world networks show that our method can perform dimension selection in network embedding and meanwhile preserve network structures.
History
Degree Type
- Doctor of Philosophy
Department
- Statistics
Campus location
- West Lafayette