Purdue University Graduate School
ENHANCING THE STOCHASTIC GRADIENT MARKOV CHAIN MONTE CARLO BY POPULATION

thesis
posted on 2025-04-26, 16:03 authored by Yushu Huang

The scalability of stochastic gradient MCMC algorithms has made them increasingly important for Bayesian analysis of big data. However, these algorithms are prone to getting trapped in local energy minima. One effective way to address this local-trap issue is to let a population of Markov chains interact with one another, so that global information becomes visible to the individual chains and helps them escape local traps. Two novel algorithms are proposed here: the evolutionary stochastic gradient Langevin dynamics (ESGLD) and the local Stochastic Particle-Optimization Sampling (LSPOS).

ESGLD operates with a population of Markov chains, each at a different temperature, which interact through a snooker operator. The snooker operator performs a short run of SGLD along the line connecting the current sample to a selected anchor point. Once a mode has been located by other chains in the population, the target chain can transition to it by running the snooker operator, particularly when the mode is sharp and well separated from others. The snooker operator enables a Markov chain to escape local traps effectively while maintaining invariance with respect to the target distribution, which significantly accelerates convergence, especially when the energy landscape is rugged.

LSPOS is an improvement on Stochastic Particle-Optimization Sampling (SPOS). In SPOS, each chain in the population runs SGLD augmented with an interaction term given by the Stein variational operator, which ensures that the sample distribution is unbiased with respect to the target distribution. The kernel usually used in the Stein variational operator of SPOS is the radial basis function (RBF) kernel. In LSPOS, this kernel is instead chosen to be a local truncation of the RBF kernel. This change confines the interactions to neighboring chains and prevents a chain from learning from distant chains whose local geometry is fundamentally different, which does not help and can even impede the chain's escape from a local trap.

We provide theoretical guarantees for the convergence of the ESGLD and LSPOS algorithms, and demonstrate their numerical effectiveness through simulated and real data examples.
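The snooker move described above, a short SGLD run along the line connecting the current sample to an anchor point, can be sketched as follows. This is a heavily simplified illustration, not the thesis's operator: the function name `snooker_sgld_move`, the step size, and the plain one-dimensional Langevin update on the line coordinate are all assumptions, and the actual ESGLD operator includes the adjustments needed to keep the target distribution invariant.

```python
import numpy as np

def snooker_sgld_move(x, anchor, grad_log_p, step=0.01, n_steps=20, rng=None):
    """Simplified snooker-style move: run Langevin dynamics on the scalar
    coordinate t parametrizing the line z(t) = x + t * (anchor - x).
    Illustrative sketch only; the real operator preserves invariance."""
    rng = np.random.default_rng() if rng is None else rng
    direction = anchor - x
    t = 0.0  # t = 0 is the current sample, t = 1 is the anchor
    for _ in range(n_steps):
        z = x + t * direction
        # directional derivative of log p along the line
        g = np.dot(grad_log_p(z), direction)
        t += 0.5 * step * g + np.sqrt(step) * rng.standard_normal()
    return x + t * direction

# Toy example: standard normal target, grad log p(z) = -z.
rng = np.random.default_rng(0)
x = np.array([5.0, 5.0])        # chain stuck far from the mode
anchor = np.array([0.0, 0.0])   # mode located by another chain
x_new = snooker_sgld_move(x, anchor, lambda z: -z,
                          step=0.01, n_steps=200, rng=rng)
```

Because the anchor sits at a mode found by another chain, the line coordinate drifts toward the high-density region, letting the stuck chain jump across the energy barrier in a single composite move.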
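The locally truncated RBF kernel and its effect on the Stein interaction can be sketched as follows. The hard-cutoff truncation, the radius parameter `r`, and the names `truncated_rbf` and `svgd_drift` are assumptions for illustration; the thesis's exact truncation form and interaction term may differ.

```python
import numpy as np

def truncated_rbf(x, y, h=1.0, r=2.0):
    """Locally truncated RBF kernel: the standard RBF inside radius r,
    exactly zero beyond it, so only nearby chains interact.
    (Hypothetical hard-cutoff form.)"""
    d2 = np.sum((x - y) ** 2)
    return np.exp(-d2 / h) if d2 <= r ** 2 else 0.0

def svgd_drift(particles, grad_log_p, h=1.0, r=2.0):
    """Stein-variational interaction drift with the truncated kernel:
    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log p(x_j)
                             - (2/h) (x_j - x_i) k(x_j, x_i) ]."""
    n = len(particles)
    drift = np.zeros_like(particles, dtype=float)
    for i in range(n):
        for j in range(n):
            k = truncated_rbf(particles[j], particles[i], h, r)
            if k == 0.0:
                continue  # distant chains contribute nothing
            drift[i] += k * grad_log_p(particles[j])          # attraction
            drift[i] += -(2.0 / h) * (particles[j] - particles[i]) * k  # repulsion
    return drift / n

# Toy example: standard normal target, grad log p(x) = -x.
particles = np.array([[0.0], [0.5], [10.0]])
drift = svgd_drift(particles, lambda x: -x, h=1.0, r=2.0)
```

In this example the particle at 10.0 lies outside the cutoff radius of the other two, so its drift comes only from its own gradient term; with the untruncated RBF it would also receive (exponentially small but nonzero) pulls from the distant particles, which is exactly the cross-geometry influence LSPOS suppresses.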

Funding

NSF DMS-2015498

NSF DMS-2210819

NIH R01-GM152717

History

Degree Type

  • Doctor of Philosophy

Department

  • Statistics

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Faming Liang

Advisor/Supervisor/Committee co-chair

Chuanhai Liu

Additional Committee Member 2

Vinayak Rao

Additional Committee Member 3

Qifan Song
