Purdue University Graduate School

File(s) under embargo: 4 month(s) 18 day(s) until file(s) become available

SCALABLE BAYESIAN METHODS FOR PROBABILISTIC GRAPHICAL MODELS

thesis
posted on 2024-04-25, 15:40, authored by Chuan Zuo

In recent years, probabilistic graphical models have emerged as a powerful framework for understanding complex dependencies in multivariate data, offering a structured approach to tackle uncertainty and model complexity. These models have revolutionized the way we interpret the interplay between variables in various domains, from genetics to social network analysis. Inspired by the potential of probabilistic graphical models to provide insightful data analysis while addressing the challenges of high-dimensionality and computational efficiency, this dissertation introduces two novel methodologies that leverage the strengths of graphical models in high-dimensional settings. By integrating advanced inference techniques and exploiting the structural advantages of graphical models, we demonstrate how these approaches can efficiently decode complex data patterns, offering significant improvements over traditional methods. This work not only contributes to the theoretical advancements in the field of statistical data analysis but also provides practical solutions to real-world problems characterized by large-scale, complex datasets.

Firstly, we introduce a novel Bayesian hybrid method for learning the structure of Gaussian Bayesian Networks (GBNs), addressing the critical challenge of order determination in constraint-based and score-based methodologies. By integrating a permutation matrix within the likelihood function, we propose a technique that remains invariant to data shuffling, thereby overcoming the limitations of traditional approaches. Utilizing Cholesky decomposition, we reparameterize the log-likelihood function to facilitate the identification of the parent-child relationship among nodes without relying on the faithfulness assumption. This method efficiently manages the permutation matrix to optimize for the sparsest Cholesky factor, leveraging the Bayesian Information Criterion (BIC) for model selection. Theoretical analysis and extensive simulations demonstrate the superiority of our method in terms of precision, recall, and F1-score across various network complexities and sample sizes. Specifically, our approach shows significant advantages in small-n-large-p scenarios, outperforming existing methods in detecting complex network structures with limited data. Real-world applications on datasets such as ECOLI70, ARTH150, MAGIC-IRRI, and MAGIC-NIAB further validate the effectiveness and robustness of our proposed method. Our findings contribute to the field of Bayesian network structure learning by providing a scalable, efficient, and reliable tool for modeling high-dimensional data structures.
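The link between variable ordering and the sparsity of the Cholesky factor can be illustrated with a small sketch. This is not the dissertation's algorithm: it uses a hypothetical 3-node network, the population covariance instead of data, and brute-force enumeration of orderings instead of a permutation-matrix search with BIC. The function name `cholesky_edge_count` and the edge weights are assumptions made for illustration. Regressing each variable on its predecessors in a given order yields the entries of a modified Cholesky factor, and an order compatible with the true network gives the fewest nonzero coefficients.

```python
import itertools
import numpy as np

def cholesky_edge_count(sigma, order, tol=1e-8):
    """Count nonzero coefficients when each variable is regressed
    on all of its predecessors in `order` (modified Cholesky)."""
    s = sigma[np.ix_(order, order)]  # covariance reordered by `order`
    count = 0
    for j in range(1, len(order)):
        coef = np.linalg.solve(s[:j, :j], s[:j, j])
        count += int(np.sum(np.abs(coef) > tol))
    return count

# Hypothetical network with a v-structure: 0 -> 2 <- 1.
# B[i, j] is the weight of edge i -> j; noise variances are all 1.
B = np.zeros((3, 3))
B[0, 2] = 0.8
B[1, 2] = -0.6
A = np.linalg.inv(np.eye(3) - B.T)  # X = A e solves X = B^T X + e
sigma = A @ A.T                     # implied covariance of X

# Brute-force search over all orderings (feasible only for tiny p).
counts = {
    order: cholesky_edge_count(sigma, list(order))
    for order in itertools.permutations(range(3))
}
best = min(counts.values())
for order, c in sorted(counts.items()):
    print(order, c)
```

In this toy case the orders that place both parents before node 2 recover exactly the two true edges, while orders that place node 2 earlier introduce an extra dependence between nodes 0 and 1.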

Secondly, we introduce a Bayesian methodology tailored for Gaussian Graphical Models (GGMs) that bridges the gap between GBNs and GGMs. Utilizing the Cholesky decomposition, we establish a novel connection that leverages estimated GBN structures to accurately recover and estimate GGMs. This approach rests on a theorem that connects sparse priors on Cholesky factors with the sparsity of the precision matrix, facilitating effective structure recovery in GGMs. To assess the efficacy of our proposed method, we conduct comprehensive simulations on AR2 and circle graph models, comparing its performance with renowned algorithms such as GLASSO, CLIME, and SPACE across various dimensions. Our evaluation, based on metrics like estimation accuracy and selection correctness, demonstrates the superiority of our approach in accurately identifying the intrinsic graph structure. The empirical results underscore the robustness and scalability of our method and its potential as a tool for statistical data analysis, especially in the context of complex datasets.
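As a rough illustration of how sparsity in a Cholesky factor relates to sparsity in the precision matrix, the sketch below builds a banded, AR(2)-style precision matrix and checks that its Cholesky factor has the same bandwidth. The entries 1, 0.5, and 0.25 follow a common convention for AR(2) test models; whether the dissertation uses these exact values is an assumption, and the sketch demonstrates only a standard linear-algebra fact, not the theorem referred to above.

```python
import numpy as np

p = 8
# AR(2)-style precision matrix: 1 on the diagonal, 0.5 on the first
# off-diagonals, 0.25 on the second off-diagonals (assumed values).
omega = np.eye(p)
for i in range(p):
    for j in range(p):
        if abs(i - j) == 1:
            omega[i, j] = 0.5
        elif abs(i - j) == 2:
            omega[i, j] = 0.25

# omega = L @ L.T with L lower triangular.
L = np.linalg.cholesky(omega)

tol = 1e-10
factor_support = np.abs(L) > tol        # nonzero pattern of the factor
graph_support = np.abs(L @ L.T) > tol   # nonzero pattern of the precision

rows, cols = np.indices((p, p))
within_band = np.abs(rows - cols) <= 2

# The factor has no entries outside the band, and the precision matrix
# rebuilt from it has exactly the banded pattern, i.e. the graph edges.
print(not factor_support[~within_band].any())
print((graph_support == within_band).all())
```

The off-diagonal entries of `graph_support` correspond to the edges of the Gaussian graphical model, so a factor restricted to the band cannot create edges between variables more than two positions apart.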

Degree Type

  • Doctor of Philosophy

Department

  • Statistics

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Faming Liang

Additional Committee Member 2

Kiseop Lee

Additional Committee Member 3

Qifan Song

Additional Committee Member 4

Anindya Bhadra
