Abstract:
Over the last decade, privacy leakage has become a growing concern. Differential privacy (DP) has emerged as the leading framework for both statistical data analysis and machine learning applications. However, imposing differential privacy constraints often incurs a trade-off between privacy loss and utility (or, in the learning setting, training accuracy).
In this talk, we will first present work that establishes an information-theoretic understanding of the fundamental limits that arise when DP constraints are imposed on empirical risk minimization problems. Then, we will discuss differentially private stochastic gradient descent (DP-SGD), a widely used training procedure that adds random noise to achieve differential privacy. Using DP-SGD as a concrete example, we introduce general strategies proposed to improve the privacy-utility trade-off, such as data sampling, contractive iterations, data sketching, and post-processing. These strategies either require less noise by exposing less information or mitigate the impact of the added noise. We will also present theoretical results that support these strategies.
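
For readers unfamiliar with DP-SGD, the following is a minimal sketch of a single update step in its commonly described form: clip each per-example gradient, sum, add Gaussian noise scaled to the clipping bound, then average. The function name `dp_sgd_step`, the least-squares loss, and the parameters `clip_norm` and `noise_multiplier` are illustrative assumptions and are not taken from the talk. The sketch performs no privacy accounting, so it does not by itself certify any particular privacy guarantee.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One illustrative DP-SGD step for least-squares linear regression.

    Hypothetical helper: all names and defaults are assumptions for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Per-example gradients of 0.5 * (x @ w - y)^2, shape (n, d).
    residuals = X @ w - y
    grads = residuals[:, None] * X

    # Clip each per-example gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale

    # Add Gaussian noise with standard deviation noise_multiplier * clip_norm
    # to the summed clipped gradients, then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(X)

    return w - lr * noisy_mean
```

In this sketch a larger `noise_multiplier` perturbs each update more, which is one place the privacy-utility trade-off shows up. Strategies such as data sampling would act on which rows of `X` enter each step.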