Purdue University Graduate School

STOCHASTIC NEURAL NETWORK AND CAUSAL INFERENCE

thesis
posted on 2025-01-10, 14:45 authored by Yaxin Fang

Estimating causal effects from observational data is challenging because of high-dimensional, complex datasets and confounding bias. In this thesis, we tackle these issues by leveraging deep learning techniques developed in the recent literature, including sparse deep learning and stochastic neural networks.

With the advancement of data science, the collection of increasingly complex datasets has become commonplace. In such datasets, the data dimension can be extremely high, and the underlying data generation process can be unknown and highly nonlinear. As a result, making causal inferences from high-dimensional complex data has become a fundamental problem in many disciplines, such as medicine, econometrics, and social science. However, existing causal inference methods are often developed under the assumption that the data dimension is low or that the underlying data generation process is linear or approximately linear. To address these challenges, Chapter 3 proposes a novel causal inference approach for high-dimensional complex data. By using sparse deep learning techniques, the proposed approach addresses both the high dimensionality and the unknown data generation process in a coherent way. Furthermore, the proposed approach can also be used when missing values are present in the datasets. Extensive numerical studies indicate that the proposed approach outperforms existing ones.

One of the major challenges in causal inference with observational data is handling missing confounders. Latent variable modeling is a valid framework for addressing this challenge, but current approaches within the framework often suffer from consistency issues in causal effect estimation and are hard to extend to more complex application scenarios. To bridge this gap, in Chapter 4 we propose a new latent variable modeling approach. It uses a stochastic neural network in which the latent variables are imputed as the outputs of hidden neurons using an adaptive stochastic gradient HMC algorithm. Causal inference is then conducted based on the imputed latent variables. Under mild conditions, the new approach provides a theoretical guarantee for the consistency of causal effect estimation. The new approach also serves as a versatile tool for modeling various causal relationships, leveraging the flexibility of the stochastic neural network in natural process modeling. We show that the new approach matches state-of-the-art performance on benchmarks for causal effect estimation and demonstrate its adaptability to proxy variable and multiple-cause scenarios.

Degree Type

  • Doctor of Philosophy

Department

  • Statistics

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Faming Liang

Additional Committee Member 2

Arman Sabbaghi

Additional Committee Member 3

Qifan Song

Additional Committee Member 4

Yichen Zhang
