Title
Faster Secure Data Mining via Distributed Homomorphic Encryption
Authors
Abstract
Driven by rising privacy demands in data mining, Homomorphic Encryption (HE) has recently received growing attention for its ability to perform computations directly on encrypted data. With HE, model learning can be securely outsourced to public cloud environments that are powerful but not fully trusted. However, HE-based training scales poorly because of its high computational complexity, and whether HE can be applied to large-scale problems remains an open question. In this paper, we propose a novel, general distributed HE-based data mining framework as a step toward solving this scaling problem. The main idea of our approach is to accept slightly more communication overhead in exchange for a shallower computational circuit in HE, thereby reducing the overall complexity. We verify the efficiency and effectiveness of the new framework by testing it on various data mining algorithms and benchmark datasets. For example, we train a logistic regression model to distinguish the digits 3 and 8 in about 5 minutes, whereas a centralized counterpart needs almost 2 hours.
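The trade-off between communication and circuit depth can be illustrated with a toy depth-accounting model. The sketch below is not the paper's protocol and performs no encryption. It only tracks multiplicative depth for a hypothetical gradient step that replaces the sigmoid with a cubic polynomial. The names `Ct`, `gd_step`, `run`, and `refresh_every` are assumptions introduced for illustration, as is the idea that a key holder resets the depth by decrypting and re-encrypting the weights.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Ct:
    """Stand-in for a ciphertext; tracks only multiplicative depth."""
    depth: int = 0

    def __add__(self, other: "Ct") -> "Ct":
        # Additions do not consume multiplicative depth.
        return Ct(max(self.depth, other.depth))

    def __mul__(self, other: "Ct") -> "Ct":
        # Each ciphertext-ciphertext product consumes one level.
        return Ct(max(self.depth, other.depth) + 1)


def gd_step(w: Ct, x: Ct, y: Ct) -> Ct:
    """One hypothetical gradient step with a cubic stand-in for the sigmoid."""
    z = w * x              # inner product: one level
    z3 = z * z * z         # cubic term: two more levels
    residual = z + z3 + y  # plaintext coefficients omitted; additions are free
    grad = residual * x    # one more level
    return w + grad


def run(iterations: int, refresh_every: Optional[int]) -> Tuple[int, int]:
    """Return (maximum depth required, number of refresh round trips)."""
    x, y, w = Ct(), Ct(), Ct()
    max_depth = 0
    rounds = 0
    for t in range(1, iterations + 1):
        w = gd_step(w, x, y)
        max_depth = max(max_depth, w.depth)
        if refresh_every and t % refresh_every == 0:
            # Assumption: a key holder decrypts and re-encrypts w,
            # which costs one communication round trip and resets the depth.
            w = Ct()
            rounds += 1
    return max_depth, rounds
```

Under these assumptions, `run(10, None)` requires depth 40 with no round trips, while `run(10, 1)` requires depth 4 at the cost of 10 round trips. Because HE parameters and per-operation cost grow with the supported depth, a shallower circuit can outweigh the added communication.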