Paper Title

Differentially Private Bootstrap: New Privacy Analysis and Inference Strategies

Authors

Zhanyu Wang, Guang Cheng, Jordan Awan

Abstract

Differentially private (DP) mechanisms protect individual-level information by introducing randomness into the statistical analysis procedure. Despite the availability of numerous DP tools, there remains a lack of general techniques for conducting statistical inference under DP. We examine a DP bootstrap procedure that releases multiple private bootstrap estimates to infer the sampling distribution and construct confidence intervals (CIs). Our privacy analysis presents new results on the privacy cost of a single DP bootstrap estimate, applicable to any DP mechanism, and identifies some misapplications of the bootstrap in the existing literature. For the composition of the DP bootstrap, we present a numerical method to compute the exact privacy cost of releasing multiple DP bootstrap estimates, and using the Gaussian-DP (GDP) framework (Dong et al., 2022), we show that the release of $B$ DP bootstrap estimates from mechanisms satisfying $(\mu/\sqrt{(2-2/\mathrm{e})B})$-GDP asymptotically satisfies $\mu$-GDP as $B$ goes to infinity. Then, we perform private statistical inference by post-processing the DP bootstrap estimates. We prove that our point estimates are consistent, our standard CIs are asymptotically valid, and both enjoy optimal convergence rates. To further improve the finite-sample performance, we use deconvolution with DP bootstrap estimates to accurately infer the sampling distribution. We derive CIs for tasks such as population mean estimation, logistic regression, and quantile regression, and we compare them to existing methods using simulations and real-world experiments on 2016 Canada Census data. Our private CIs achieve the nominal coverage level and offer the first approach to private inference for quantile regression.
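
To make the composition result concrete, the following is a minimal sketch, not the authors' implementation. It assumes a clipped-mean statistic released through the Gaussian mechanism (noise scale equal to sensitivity divided by the GDP parameter), and it budgets each of the $B$ releases at $\mu/\sqrt{(2-2/\mathrm{e})B}$ as stated in the abstract. The function names, the clipping bounds, and the choice of statistic are illustrative assumptions, and the overall $\mu$-GDP guarantee is asymptotic in $B$, not exact for finite $B$.

```python
import math
import random
import statistics


def per_estimate_gdp(mu_total: float, B: int) -> float:
    """Per-release GDP parameter so that B DP bootstrap releases are
    asymptotically mu_total-GDP (formula taken from the abstract)."""
    return mu_total / math.sqrt((2.0 - 2.0 / math.e) * B)


def dp_bootstrap_means(x, B, mu_total, lower, upper, rng):
    """Release B noisy bootstrap means (hypothetical helper).

    Assumptions: data are clipped to [lower, upper]; neighboring datasets
    differ by replacing one record, so the mean has sensitivity
    (upper - lower) / n; Gaussian noise with sigma = sensitivity / mu_b
    makes each release mu_b-GDP with respect to its input sample.
    """
    clipped = [min(max(v, lower), upper) for v in x]
    n = len(clipped)
    mu_b = per_estimate_gdp(mu_total, B)
    sensitivity = (upper - lower) / n
    sigma = sensitivity / mu_b
    estimates = []
    for _ in range(B):
        # Resample n records with replacement, then privatize the mean.
        sample = rng.choices(clipped, k=n)
        estimates.append(statistics.fmean(sample) + rng.gauss(0.0, sigma))
    return estimates


rng = random.Random(0)
data = [rng.random() for _ in range(1000)]  # synthetic data in [0, 1]
estimates = dp_bootstrap_means(data, B=100, mu_total=1.0,
                               lower=0.0, upper=1.0, rng=rng)
point_estimate = statistics.fmean(estimates)  # post-processing only
```

Averaging the released estimates is pure post-processing, so it consumes no additional privacy budget. Constructing confidence intervals from these estimates, including the deconvolution step, is left to the paper.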
