Paper Title
Learning Robust Representations Of Generative Models Using Set-Based Artificial Fingerprints
Paper Authors
Paper Abstract
With recent progress in deep generative models, the problem of identifying synthetic data and comparing their underlying generative processes has become an imperative task for various reasons, including fighting visual misinformation and source attribution. Existing methods often approximate the distance between the models via their sample distributions. In this paper, we approach the problem of fingerprinting generative models by learning representations that encode the residual artifacts left by the generative models as unique signals that identify the source models. We consider these unique traces (a.k.a. "artificial fingerprints") as representations of generative models, and demonstrate their usefulness in both the discriminative task of source attribution and the unsupervised task of defining a similarity between the underlying models. We first extend the existing studies on fingerprints of GANs to four representative classes of generative models (VAEs, Flows, GANs, and score-based models), and demonstrate their existence and attributability. We then improve the stability and attributability of the fingerprints by proposing a new learning method based on set-encoding and contrastive training. Our set-encoder, unlike existing methods that operate on individual images, learns fingerprints from a "set" of images. We demonstrate improvements in the stability and attributability through comparisons to state-of-the-art fingerprint methods and ablation studies. Further, our method employs contrastive training to learn an implicit similarity between models. We discover latent families of generative models using this metric in a standard hierarchical clustering algorithm.
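To make the set-encoding and contrastive-training steps described in the abstract concrete, below is a minimal illustrative sketch in PyTorch. It is not the authors' implementation: the class name SetFingerprintEncoder, the small CNN backbone, mean pooling over the set, and the supervised-contrastive-style loss are all assumptions chosen for brevity, standing in for whatever architecture and objective the paper actually uses.

```python
# Illustrative sketch only (not the paper's architecture): a set-based
# fingerprint encoder trained so that sets of images from the same
# generative model map to nearby fingerprint vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SetFingerprintEncoder(nn.Module):
    """Encode a *set* of images from one source model into a single
    fingerprint by embedding images independently and mean-pooling."""

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        # Small per-image backbone; a high-pass / residual filter could
        # precede it to emphasize generation artifacts (assumption).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, image_set: torch.Tensor) -> torch.Tensor:
        # image_set: (batch, set_size, 3, H, W)
        b, s, c, h, w = image_set.shape
        per_image = self.backbone(image_set.reshape(b * s, c, h, w))
        # Permutation-invariant pooling over the set dimension.
        fingerprints = per_image.reshape(b, s, -1).mean(dim=1)
        return F.normalize(fingerprints, dim=-1)


def contrastive_loss(fingerprints: torch.Tensor, model_ids: torch.Tensor,
                     temperature: float = 0.1) -> torch.Tensor:
    """Supervised-contrastive-style loss: fingerprints of sets drawn from
    the same source model are pulled together, others pushed apart."""
    sim = fingerprints @ fingerprints.t() / temperature          # cosine similarities
    mask = (model_ids.unsqueeze(0) == model_ids.unsqueeze(1))    # positives share a source model
    mask.fill_diagonal_(False)
    logits = sim - torch.eye(sim.size(0), device=sim.device) * 1e9  # drop self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    mask_f = mask.float()
    pos_counts = mask_f.sum(dim=1).clamp(min=1)                  # avoid division by zero
    return -((log_prob * mask_f).sum(dim=1) / pos_counts).mean()
```

As a follow-up step in the same spirit as the abstract's last sentence, pairwise cosine distances between the learned fingerprints of held-out models could be passed to an off-the-shelf hierarchical clustering routine (e.g., scipy.cluster.hierarchy.linkage) to recover candidate families of generative models; the specific linkage choice here is likewise an assumption.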