IMPROVING THE UTILITY OF DIFFERENTIALLY PRIVATE ALGORITHMS USING DATA CHARACTERISTICS
As data continues to grow rapidly in volume and complexity, there is an increasing need to extract meaningful insights from it. These datasets often contain sensitive individual information, making privacy protection crucial. Differential privacy has become the de facto standard for protecting individuals' privacy. Many datasets also have known constraints and structures. Can these known constraints or structures be leveraged to design mechanisms with better utility?
This thesis demonstrates that, by leveraging the inherent structures and constraints within datasets, one can design differentially private mechanisms that offer better utility (i.e., more accurate results) while maintaining the required level of privacy. We explore modifications to the basic mechanisms that take advantage of dataset-specific properties, such as sparsity, distributional assumptions, or other contextual information. The aim is to reduce the noise added, thereby improving the utility of differentially private outputs.
Many datasets come with known constraints. We show that generating differentially private synthetic data that preserves these constraints increases utility across several metrics, including marginal queries, classification task accuracy, and clustering.
Smooth sensitivity is a data-dependent sensitivity measure that allows noise to be calibrated to the actual dataset rather than to the worst case. It addresses a limitation of local sensitivity, which can change abruptly between neighboring datasets and therefore cannot safely be used on its own to calibrate noise; smooth sensitivity upper-bounds local sensitivity in a way that varies slowly under small changes in the data, so the privacy guarantee holds even in the presence of outliers. We have developed a differentially private Naive Bayes model using smooth sensitivity. By using data-dependent sensitivity measures like smooth sensitivity and incorporating known data constraints, we can reduce the amount of noise added, resulting in a more accurate model.
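To make the idea of smooth sensitivity concrete, the sketch below computes it for the median of a bounded dataset, which is the standard textbook illustration of the concept. It is not the Naive Bayes construction developed in the thesis. The function name, the clipping bounds `lower` and `upper`, and the smoothing parameter `beta` are assumptions introduced here for illustration only. The sketch stops at computing the sensitivity value; how noise is scaled from it depends on the noise distribution and privacy parameters, which are not covered here.

```python
import math

def smooth_sensitivity_median(values, beta, lower, upper):
    """Illustrative beta-smooth sensitivity of the median on [lower, upper].

    Hypothetical helper, not taken from the thesis. Assumes an odd number
    of values; values outside the bounds are clipped.
    """
    x = sorted(min(max(v, lower), upper) for v in values)
    n = len(x)
    m = (n - 1) // 2  # 0-based index of the median

    def at(i):
        # Positions outside the data are padded with the domain bounds.
        if i < 0:
            return lower
        if i >= n:
            return upper
        return x[i]

    best = 0.0
    for k in range(n + 1):
        # Largest local sensitivity reachable by changing at most k records.
        local_k = max(at(m + t) - at(m + t - k - 1) for t in range(k + 2))
        # Datasets farther away are discounted exponentially in k.
        best = max(best, math.exp(-beta * k) * local_k)
    return best
```

With `beta = 0` the discount disappears and the result equals the worst-case (global) sensitivity `upper - lower`; as `beta` grows, the result approaches the local sensitivity of the dataset itself. Smooth sensitivity sits between these two extremes.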
Degree Type
- Doctor of Philosophy
Department
- Computer Science
Campus location
- West Lafayette