Paper Title
Domain Discrepancy Aware Distillation for Model Aggregation in Federated Learning
Paper Authors
Paper Abstract
Knowledge distillation has recently become popular as a method of model aggregation on the server in federated learning. It is generally assumed that abundant public unlabeled data are available on the server. In reality, however, there exists a domain discrepancy between the server-domain dataset and the client domains, which limits the performance of knowledge distillation. How to improve aggregation under such a domain discrepancy setting remains an open problem. In this paper, we first analyze the generalization bound of the aggregation model produced by knowledge distillation over the client domains, and then describe two challenges that domain discrepancies pose to the aggregation model: server-to-client discrepancy and client-to-client discrepancy. Following our analysis, we propose FedD3A, an adaptive knowledge aggregation algorithm based on domain discrepancy aware distillation, to lower this bound. FedD3A performs adaptive weighting at the sample level in each round of FL: for each sample in the server domain, only the client models from similar domains are selected to play the teacher role. To achieve this, we show that the discrepancy between a server-side sample and a client domain can be approximately measured using a subspace projection matrix computed on each client without accessing its raw data. The server can thus leverage the projection matrices from multiple clients to assign weights to the corresponding teacher models for each server-side sample. We validate FedD3A on two popular cross-domain datasets and show that it outperforms competing methods in both cross-silo and cross-device FL settings.
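The abstract's key mechanism, measuring a server-side sample's affinity to each client domain via a client-computed subspace projection matrix and turning those affinities into per-sample teacher weights, can be illustrated with a minimal NumPy sketch. This is a hedged illustration, not the paper's actual implementation: the function names (`subspace_projection`, `teacher_weights`), the rank-`k` SVD construction, and the softmax-over-negative-residuals weighting are all assumptions chosen to make the idea concrete.

```python
import numpy as np

def subspace_projection(client_features, k=5):
    """Build a rank-k projection matrix onto the span of a client's
    feature vectors (a hypothetical stand-in for the per-client
    subspace matrix described in the abstract).

    client_features: array of shape (n_samples, d).
    Returns a (d, d) orthogonal-projection matrix; only this matrix
    (not the raw data) would be sent to the server.
    """
    # Top-k right singular vectors form an orthonormal basis for the
    # dominant directions of the client's feature distribution.
    _, _, vt = np.linalg.svd(client_features, full_matrices=False)
    basis = vt[:k].T          # (d, k)
    return basis @ basis.T    # (d, d)

def teacher_weights(sample_feature, projections, temperature=1.0):
    """Weight each client (teacher) for one server-side sample:
    a small residual after projecting onto the client's subspace
    means the sample looks 'in-domain' for that client, so that
    teacher gets a larger weight (softmax over negative residuals)."""
    residuals = np.array([
        np.linalg.norm(sample_feature - P @ sample_feature)
        for P in projections
    ])
    logits = -residuals / temperature
    w = np.exp(logits - logits.max())   # shift for numerical stability
    return w / w.sum()
```

In such a scheme, the server would distill from a per-sample mixture of teacher outputs, e.g. `sum(w[i] * teacher_logits[i] for i in range(num_clients))`, so clients whose domains are far from a given server sample contribute little to its distillation target.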