Paper Title
From a Fourier-Domain Perspective on Adversarial Examples to a Wiener Filter Defense for Semantic Segmentation
Authors
Abstract
Despite recent advancements, deep neural networks are not robust against adversarial perturbations. Many of the proposed adversarial defense approaches use computationally expensive training mechanisms that do not scale to complex real-world tasks such as semantic segmentation, and offer only marginal improvements. In addition, fundamental questions on the nature of adversarial perturbations and their relation to the network architecture are largely understudied. In this work, we study the adversarial problem from a frequency domain perspective. More specifically, we analyze discrete Fourier transform (DFT) spectra of several adversarial images and report two major findings: First, there exists a strong connection between a model architecture and the nature of adversarial perturbations that can be observed and addressed in the frequency domain. Second, the observed frequency patterns are largely image- and attack-type independent, which is important for the practical impact of any defense making use of such patterns. Motivated by these findings, we additionally propose an adversarial defense method based on the well-known Wiener filters that captures and suppresses adversarial frequencies in a data-driven manner. Our proposed method not only generalizes across unseen attacks but also beats five existing state-of-the-art methods across two models in a variety of attack settings.
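To make the idea of a data-driven frequency-domain filter concrete, the following is a minimal sketch of the textbook Wiener denoising filter, G = S_clean / (S_clean + S_noise), estimated from paired clean and adversarial images. It is not the authors' implementation: the function names, the single-channel image layout, the `eps` stabilizer, and the synthetic checkerboard "perturbation" in `make_demo_data` are all assumptions made purely for illustration.

```python
import numpy as np


def estimate_wiener_filter(clean, adversarial, eps=1e-12):
    """Estimate a Wiener filter from paired images of shape (N, H, W).

    Assumption: the perturbation is treated as additive noise, and power
    spectra are averaged over the N training pairs.
    """
    noise = adversarial - clean
    # Average power spectrum of the clean images and of the perturbations.
    clean_power = np.mean(np.abs(np.fft.fft2(clean)) ** 2, axis=0)
    noise_power = np.mean(np.abs(np.fft.fft2(noise)) ** 2, axis=0)
    # Classical Wiener gain per frequency bin; eps avoids division by zero.
    return clean_power / (clean_power + noise_power + eps)


def apply_wiener_filter(image, wiener):
    """Filter one (H, W) image by scaling its DFT with the Wiener gain."""
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(wiener * spectrum))


def make_demo_data(n=16, size=32, seed=0):
    """Hypothetical toy data: smooth images plus a checkerboard pattern.

    The checkerboard stands in for a perturbation concentrated at a single
    high frequency; real adversarial perturbations are more complex.
    """
    rng = np.random.default_rng(seed)
    freqs = np.fft.fftfreq(size) * size  # integer frequency indices
    low_pass = (np.abs(freqs)[:, None] < 4) & (np.abs(freqs)[None, :] < 4)
    white = rng.normal(size=(n, size, size))
    clean = np.real(np.fft.ifft2(np.fft.fft2(white) * low_pass))
    idx = np.arange(size)
    checker = (-1.0) ** (idx[:, None] + idx[None, :])
    signs = rng.choice([-1.0, 1.0], size=(n, 1, 1))
    adversarial = clean + 0.5 * signs * checker
    return clean, adversarial


if __name__ == "__main__":
    clean, adversarial = make_demo_data()
    wiener = estimate_wiener_filter(clean, adversarial)
    restored = apply_wiener_filter(adversarial[0], wiener)
    print("MSE before:", np.mean((adversarial[0] - clean[0]) ** 2))
    print("MSE after: ", np.mean((restored - clean[0]) ** 2))
```

In this toy setting the perturbation energy sits in a frequency bin that the clean images do not occupy, so the estimated gain suppresses it almost entirely. The abstract reports that the frequency patterns it observes are largely image- and attack-type independent, which is the property that would let a filter estimated once be reused across inputs.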