Paper title
Differentially private multivariate medians
Paper authors
Paper abstract
Statistical tools that satisfy rigorous privacy guarantees are necessary for modern data analysis. It is well known that robustness against contamination is linked to differential privacy. Despite this, the use of multivariate medians for differentially private and robust multivariate location estimation has not been systematically studied. We develop novel, essentially sharp finite-sample performance guarantees for differentially private multivariate depth-based medians. Our results cover commonly used depth functions, such as the halfspace (or Tukey) depth, the spatial depth, and the integrated dual depth. We show that under Cauchy marginals, the cost of heavy-tailed location estimation outweighs the cost of privacy. We demonstrate our results numerically using a Gaussian contamination model in dimensions up to d = 100, and compare them to a state-of-the-art private mean estimation algorithm. As a by-product of our investigation, we prove concentration inequalities for the output of the exponential mechanism about the maximizer of the population objective function. These inequalities apply to objective functions that satisfy a mild regularity condition.
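To make the idea of a private depth-based median concrete, here is a minimal sketch of the exponential mechanism applied to a count-based halfspace depth. It is an illustration under stated assumptions, not the algorithm analysed in the paper: the finite, data-independent candidate grid, the 16 random projection directions used to approximate the depth, the two-dimensional toy data, and the names `approx_halfspace_depth` and `exponential_mechanism` are all hypothetical choices made here. Because the depth is a minimum over halfspace counts, replacing one observation changes it by at most 1, which is the sensitivity the sampler uses.

```python
import numpy as np


def approx_halfspace_depth(candidates, data, directions):
    """Count-based halfspace depth, minimised over a fixed set of directions."""
    cand_proj = candidates @ directions.T  # shape (m, k)
    data_proj = data @ directions.T        # shape (n, k)
    # For each candidate and direction, count the data points on either side.
    below = (data_proj[None, :, :] <= cand_proj[:, None, :]).sum(axis=1)
    above = (data_proj[None, :, :] >= cand_proj[:, None, :]).sum(axis=1)
    # Depth = smallest halfspace count over all directions and both sides.
    return np.minimum(below, above).min(axis=1)


def exponential_mechanism(candidates, utilities, epsilon, sensitivity, rng):
    """Sample a candidate with probability proportional to exp(eps * u / (2 * sensitivity))."""
    logits = epsilon * np.asarray(utilities, dtype=float) / (2.0 * sensitivity)
    logits -= logits.max()  # subtract the maximum for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]


rng = np.random.default_rng(0)

# Toy data (assumption): 500 Gaussian points centred at (1, -1).
true_center = np.array([1.0, -1.0])
data = rng.normal(size=(500, 2)) + true_center

# Candidate grid chosen without looking at the data, as privacy requires.
axis = np.linspace(-5.0, 5.0, 21)
candidates = np.array([[x, y] for x in axis for y in axis])

# Random unit directions, also independent of the data.
directions = rng.normal(size=(16, 2))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

depths = approx_halfspace_depth(candidates, data, directions)
private_median = exponential_mechanism(
    candidates, depths, epsilon=5.0, sensitivity=1.0, rng=rng
)
print(private_median)
```

A grid of this kind grows exponentially with the dimension, so the sketch only conveys the sampling principle in low dimensions. It says nothing about how the paper reaches d = 100.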