Paper Title
General Adversarial Defense Against Black-box Attacks via Pixel Level and Feature Level Distribution Alignments
Paper Authors
Paper Abstract
Deep Neural Networks (DNNs) are vulnerable to black-box adversarial attacks that are highly transferable. This threat arises from the distribution gap between adversarial and clean samples in the feature space of the target DNNs. In this paper, we use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate this distribution gap. The trained DGNs align the distribution of adversarial samples with that of clean ones for the target DNNs by translating pixel values. Different from previous work, we propose a more effective pixel-level training constraint to make this achievable, thereby enhancing robustness against adversarial samples. Further, a class-aware feature-level constraint is formulated for integrated distribution alignment. Our approach is general and applicable to multiple tasks, including image classification, semantic segmentation, and object detection. We conduct extensive experiments on different datasets. Our strategy demonstrates its unique effectiveness and generality against black-box attacks.
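As a rough illustration of the ingredients named in the abstract, the sketch below trains a toy purifier network with a pixel-level reconstruction term and a feature-level alignment term computed against a frozen stand-in target model. The architecture, loss forms, loss weighting, and the omission of the class-aware aspect of the feature constraint are all simplifying assumptions made here for illustration; this is not the paper's implementation.

```python
# Minimal sketch of a purification-style defense with pixel-level and
# feature-level alignment losses. All architectures, loss forms, and
# hyperparameters are illustrative assumptions, not the paper's exact method.
import torch
import torch.nn as nn

class Purifier(nn.Module):
    """Toy image-to-image network standing in for the Deep Generative Network (DGN)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

def alignment_losses(purifier, target_model, x_adv, x_clean):
    """Pixel-level and feature-level alignment terms for one batch.

    target_model is the (frozen) defended DNN; the feature-level term pulls
    the purified sample's features toward those of the clean sample.
    """
    x_pur = purifier(x_adv)
    # Pixel-level constraint: purified adversarial image should match the clean image.
    pixel_loss = nn.functional.l1_loss(x_pur, x_clean)
    # Feature-level constraint: match features of the target DNN on clean inputs.
    with torch.no_grad():
        feat_clean = target_model(x_clean)
    feat_pur = target_model(x_pur)
    feature_loss = nn.functional.mse_loss(feat_pur, feat_clean)
    return pixel_loss, feature_loss

if __name__ == "__main__":
    purifier = Purifier()
    # A small frozen feature extractor stands in for the target DNN's feature space.
    target_model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )
    for p in target_model.parameters():
        p.requires_grad_(False)

    opt = torch.optim.Adam(purifier.parameters(), lr=1e-4)
    x_clean = torch.rand(4, 3, 32, 32)
    # Additive noise stands in for real adversarial perturbations in this toy example.
    x_adv = (x_clean + 0.03 * torch.randn_like(x_clean)).clamp(0, 1)

    pix, feat = alignment_losses(purifier, target_model, x_adv, x_clean)
    loss = pix + 0.1 * feat  # weighting is an arbitrary placeholder
    loss.backward()
    opt.step()
    print(f"pixel loss {pix.item():.4f}, feature loss {feat.item():.4f}")
```

At inference time, such a purifier would simply be prepended to the target model, so incoming (possibly adversarial) images are translated toward the clean-data distribution before classification, segmentation, or detection.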