File(s) under embargo
until file(s) become available
GENERAL-PURPOSE STATISTICAL INFERENCE WITH DIFFERENTIAL PRIVACY GUARANTEES
Differential privacy (DP) uses a probabilistic framework to measure the level of privacy protection of a mechanism that releases data analysis results to the public. Although DP is widely used by both government and industry, there is still a lack of research on statistical inference under DP guarantees. On the one hand, existing DP mechanisms mainly aim to extract dataset-level information instead of population-level information. On the other hand, DP mechanisms introduce calibrated noises into the released statistics, which often results in sampling distributions more complex and intractable than the non-private ones. This dissertation aims to provide general-purpose methods for statistical inference, such as confidence intervals (CIs) and hypothesis tests (HTs), that satisfy the DP guarantees.
In the first part of the dissertation, we examine a DP bootstrap procedure that releases multiple private bootstrap estimates to construct DP CIs. We present new DP guarantees for this procedure and propose to use deconvolution with DP bootstrap estimates to derive CIs for inference tasks such as population mean, logistic regression, and quantile regression. Our method achieves the nominal coverage level in both simulations and real-world experiments and offers the first approach to private inference for quantile regression.
In the second part of the dissertation, we propose to use the simulation-based ``repro sample'' approach to produce CIs and HTs based on DP statistics. Our methodology has finite-sample guarantees and can be applied to a wide variety of private inference problems. It appropriately accounts for biases introduced by DP mechanisms (such as by clamping) and improves over other state-of-the-art inference methods in terms of the coverage and type I error of the private inference.
In the third part of the dissertation, we design a debiased parametric bootstrap framework for DP statistical inference. We propose the adaptive indirect estimator, a novel simulation-based estimator that is consistent and corrects the clamping bias in the DP mechanisms. We also prove that our estimator has the optimal asymptotic variance among all well-behaved consistent estimators, and the parametric bootstrap results based on our estimator are consistent. Simulation studies show that our framework produces valid DP CIs and HTs in finite sample settings, and it is more efficient than other state-of-the-art methods.
Simulation-Based Inference for Differential Privacy
Directorate for Social, Behavioral & Economic SciencesFind out more...
Collaborative Research: Robust Deep Learning in Real Physical Space: Generalization, Scalability, and Credibility
Directorate for Mathematical & Physical SciencesFind out more...
Statistical Method and Theory for Privacy and Fairness in Trustworthy Artificial Intelligence
United States Department of the NavyFind out more...
- Doctor of Philosophy
- West Lafayette