A study of the prediction performance and multivariate extensions of the horseshoe estimator

Li, Yunfan

doi:10.25394/PGS.8029262.v1

thesis

Posted on 2019-05-14, 18:20; authored by Yunfan Li

The horseshoe prior has been shown to successfully handle high-dimensional sparse estimation problems. It both adapts to sparsity efficiently and provides nearly unbiased estimates for large signals. In addition, efficient sampling algorithms have been developed and successfully applied to a vast array of high-dimensional sparse estimation problems. In this dissertation, we investigate the prediction performance of the horseshoe prior in sparse regression, and extend the horseshoe prior to two multivariate settings.

We begin with a study of the finite sample prediction performance of shrinkage regression methods, where the risk can be unbiasedly estimated using Stein's approach. We show that the horseshoe prior achieves an improved prediction risk over global shrinkage rules, by using a component-specific local shrinkage term that is learned from the data under a heavy-tailed prior, in combination with a global term providing shrinkage towards zero. We demonstrate improved prediction performance in a simulation study and in a pharmacogenomics data set, confirming our theoretical findings.
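As an illustration of the local and global shrinkage terms described above, the following is a minimal sketch based on the standard horseshoe formulation from the literature, not on the dissertation's own code. It assumes a normal means model with unit noise variance, local scales drawn from a half-Cauchy(0, 1) prior, and a fixed global scale `tau`; the function name and parameters are hypothetical. Under these assumptions the posterior mean of a coefficient given its local scale is (1 - kappa) times the observation, so the prior on kappa shows how much shrinkage the prior favors.

```python
import math
import random


def horseshoe_shrinkage_weights(n_draws, tau=1.0, seed=0):
    """Draw shrinkage weights kappa = 1 / (1 + tau^2 * lambda^2).

    lambda is a half-Cauchy(0, 1) local scale (assumed prior), and tau is a
    fixed global scale (assumed known here purely for illustration).
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(n_draws):
        # Inverse-CDF draw from a standard Cauchy, folded to the half-Cauchy.
        lam = abs(math.tan(math.pi * (rng.random() - 0.5)))
        weights.append(1.0 / (1.0 + (tau * lam) ** 2))
    return weights


kappa = horseshoe_shrinkage_weights(100_000)
n = len(kappa)

# Share of prior mass near "no shrinkage" (kappa near 0), near
# "full shrinkage" (kappa near 1), and in a band of equal width at 0.5.
near_zero = sum(k < 0.1 for k in kappa) / n
near_one = sum(k > 0.9 for k in kappa) / n
middle = sum(0.45 < k < 0.55 for k in kappa) / n
mean_kappa = sum(kappa) / n

print(f"mass near 0: {near_zero:.3f}")
print(f"mass near 1: {near_one:.3f}")
print(f"mass near 0.5: {middle:.3f}")
```

With `tau = 1` the weights follow a Beta(1/2, 1/2) distribution, whose density piles up at both ends of the unit interval. This is the sense in which the heavy-tailed local term can leave large signals almost untouched while shrinking noise strongly towards zero, in contrast to a single global weight shared by every component.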

We then shift to extending the horseshoe prior to handle two high-dimensional multivariate problems. First, we develop a new estimator of the inverse covariance matrix for high-dimensional multivariate normal data. The proposed graphical horseshoe estimator has attractive properties compared to other popular estimators. The most prominent benefit is that when the true inverse covariance matrix is sparse, the graphical horseshoe estimator provides estimates with small information divergence from the sampling model. The posterior mean under the graphical horseshoe prior can also be almost unbiased under certain conditions. In addition to these theoretical results, we provide a full Gibbs sampler for implementation. The graphical horseshoe estimator compares favorably to existing techniques in simulations and in a human gene network data analysis.
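For orientation, one way to write an element-wise horseshoe prior on a precision matrix is sketched below. The notation is an assumption for illustration ($\Omega = (\omega_{ij})$ for the inverse covariance matrix, $\lambda_{ij}$ for local scales, $\tau$ for the global scale, $C^{+}(0, 1)$ for the standard half-Cauchy distribution) and may differ from the exact specification used in the dissertation.

```latex
\begin{align*}
\omega_{ii} &\propto 1, \\
\omega_{ij} \mid \lambda_{ij}, \tau &\sim \mathcal{N}\!\left(0, \lambda_{ij}^{2}\tau^{2}\right), \quad i < j, \\
\lambda_{ij} &\sim C^{+}(0, 1), \qquad \tau \sim C^{+}(0, 1),
\end{align*}
% with the prior restricted to symmetric positive definite \Omega.
```

In this sketch each off-diagonal element receives its own local scale, so a sparse precision matrix corresponds to most $\lambda_{ij}$ being small while a few remain large enough to leave true edges nearly unshrunk.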

In our second setting, we apply the horseshoe prior to the joint estimation of regression coefficients and the inverse covariance matrix in normal models. The computational challenge in this problem stems from the dimensionality of the parameter space, which routinely exceeds the sample size. We show that the advantages of the horseshoe prior in estimating a mean vector or an inverse covariance matrix separately are also present when addressing both simultaneously. We propose a fully Bayesian treatment, with a sampling algorithm that is linear in the number of predictors. Extensive performance comparisons are provided with both frequentist and Bayesian alternatives, and both estimation and prediction performance are verified on a genomic data set.

History

Degree Type

Doctor of Philosophy

Department

Statistics

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Anindya Bhadra

Advisor/Supervisor/Committee co-chair

Bruce A. Craig

Additional Committee Member 2

Jun Xie

Additional Committee Member 3

Michael Zhu

Keywords

Bayesian statistical model; multivariate analysis; Gaussian graphical model (GGM); Statistics

Licence

CC BY 4.0
