Purdue University Graduate School

File(s) under embargo: 1 year(s), 6 month(s), 17 day(s) until file(s) become available

GENERAL-PURPOSE STATISTICAL INFERENCE WITH DIFFERENTIAL PRIVACY GUARANTEES

thesis
posted on 2023-12-06, 19:35 authored by Zhanyu Wang

Differential privacy (DP) uses a probabilistic framework to measure the level of privacy protection of a mechanism that releases data analysis results to the public. Although DP is widely used by both government and industry, there is still a lack of research on statistical inference under DP guarantees. On the one hand, existing DP mechanisms mainly aim to extract dataset-level information instead of population-level information. On the other hand, DP mechanisms introduce calibrated noise into the released statistics, which often makes their sampling distributions more complex and less tractable than those of the non-private statistics. This dissertation aims to provide general-purpose methods for statistical inference, such as confidence intervals (CIs) and hypothesis tests (HTs), that satisfy DP guarantees.
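As an illustration of what "calibrated noise" can mean in practice, the following is a minimal sketch of a textbook Laplace mechanism applied to a clamped sample mean. It is not taken from the dissertation: the helper names (`clamp`, `laplace_noise`, `dp_clamped_mean`), the clamping bounds, and the choice of the Laplace mechanism are all assumptions made for the example, and the sensitivity bound assumes neighboring datasets differ by replacing one record.

```python
import random

def clamp(x, lo, hi):
    """Restrict x to the interval [lo, hi]."""
    return max(lo, min(hi, x))

def laplace_noise(scale, rng):
    """Laplace(0, scale) draw: the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_clamped_mean(data, lo, hi, epsilon, rng):
    """Hypothetical helper: epsilon-DP mean of data clamped to [lo, hi]."""
    n = len(data)
    clamped_mean = sum(clamp(x, lo, hi) for x in data) / n
    # Replacing one record moves the clamped mean by at most (hi - lo) / n.
    sensitivity = (hi - lo) / n
    # Noise scale is calibrated to sensitivity / epsilon.
    return clamped_mean + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)
data = [0.0, 5.0, 10.0, 100.0]
# The record 100.0 is clamped to 10.0 before averaging, then noise is added.
release = dp_clamped_mean(data, lo=0.0, hi=10.0, epsilon=1.0, rng=rng)
```

The released value combines two distortions: the clamping step, which shifts the statistic, and the added noise, which widens its sampling distribution. Both have to be accounted for when the release is used for inference about the population.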

In the first part of the dissertation, we examine a DP bootstrap procedure that releases multiple private bootstrap estimates to construct DP CIs. We present new DP guarantees for this procedure and propose to use deconvolution with DP bootstrap estimates to derive CIs for inference tasks such as population mean estimation, logistic regression, and quantile regression. Our method achieves the nominal coverage level in both simulations and real-world experiments and offers the first approach to private inference for quantile regression.

In the second part of the dissertation, we propose to use the simulation-based "repro sample" approach to produce CIs and HTs based on DP statistics. Our methodology has finite-sample guarantees and can be applied to a wide variety of private inference problems. It appropriately accounts for biases introduced by DP mechanisms (such as those caused by clamping) and improves over other state-of-the-art inference methods in terms of the coverage and type I error of the private inference.
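The general idea behind simulation-based inference on DP statistics can be sketched as follows: for a candidate parameter value, repeatedly generate synthetic data, pass it through the same private mechanism, and inspect the resulting distribution of the released statistic. The code below is only a rough illustration under assumed settings (a normal model with unit variance, a Laplace mechanism on a clamped mean, and hypothetical helper names); it is not the repro sample procedure from the dissertation.

```python
import random
import statistics

def dp_clamped_mean(data, lo, hi, epsilon, rng):
    """Hypothetical helper: Laplace-noised mean of data clamped to [lo, hi]."""
    n = len(data)
    clamped_mean = sum(max(lo, min(hi, x)) for x in data) / n
    scale = (hi - lo) / (n * epsilon)  # sensitivity / epsilon
    # Difference of two i.i.d. exponentials is a Laplace(0, scale) draw.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clamped_mean + noise

def simulate_dp_statistic(theta, n, lo, hi, epsilon, reps, seed):
    """Simulate the private statistic `reps` times under N(theta, 1) data."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        sample = [rng.gauss(theta, 1.0) for _ in range(n)]
        draws.append(dp_clamped_mean(sample, lo, hi, epsilon, rng))
    return draws

# Assumed settings: true mean 2.0 sits on the upper clamping bound.
draws = simulate_dp_statistic(theta=2.0, n=100, lo=0.0, hi=2.0,
                              epsilon=1.0, reps=200, seed=0)
center = statistics.mean(draws)
```

In this setup the simulated statistics concentrate noticeably below the true mean of 2.0, because every observation above the upper bound is pulled down to it. An inference method that treated the release as an unbiased noisy mean would therefore be miscalibrated, whereas simulating the full mechanism captures the clamping bias and the added noise together.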

In the third part of the dissertation, we design a debiased parametric bootstrap framework for DP statistical inference. We propose the adaptive indirect estimator, a novel simulation-based estimator that is consistent and corrects the clamping bias in the DP mechanisms. We also prove that our estimator has the optimal asymptotic variance among all well-behaved consistent estimators, and the parametric bootstrap results based on our estimator are consistent. Simulation studies show that our framework produces valid DP CIs and HTs in finite sample settings, and it is more efficient than other state-of-the-art methods.

Funding

Simulation-Based Inference for Differential Privacy

Directorate for Social, Behavioral & Economic Sciences


Collaborative Research: Robust Deep Learning in Real Physical Space: Generalization, Scalability, and Credibility

Directorate for Mathematical & Physical Sciences


Statistical Method and Theory for Privacy and Fairness in Trustworthy Artificial Intelligence

United States Department of the Navy


History

Degree Type

  • Doctor of Philosophy

Department

  • Statistics

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Jordan Awan

Advisor/Supervisor/Committee co-chair

Guang Cheng

Additional Committee Member 2

Vinayak Rao

Additional Committee Member 3

Christopher W. Clifton