Paper Title

Towards Out-of-Distribution Adversarial Robustness

Paper Authors

Adam Ibrahim, Charles Guille-Escuret, Ioannis Mitliagkas, Irina Rish, David Krueger, Pouya Bashivan

Abstract

Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fails to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different $L_p$ norms, we show that there is potential for improvement against many commonly used attacks by adopting a domain generalisation approach. Concretely, we treat each type of attack as a domain, and apply the Risk Extrapolation method (REx), which promotes similar levels of robustness against all training attacks. Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training. Moreover, we achieve superior performance on families or tunings of attacks only encountered at test time. On ensembles of attacks, our approach improves the accuracy from 3.4% with the best existing baseline to 25.9% on MNIST, and from 16.9% to 23.5% on CIFAR10.
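
To make the core idea concrete, below is a minimal sketch (not the authors' implementation) of how a REx-style variance penalty could be applied with each adversarial attack treated as a domain, in PyTorch. The function name, the `attack` callables, and the penalty weight `beta` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def vrex_adversarial_loss(model, x, y, attacks, beta=10.0):
    """Sketch of a V-REx objective over adversarial attacks as domains."""
    # Per-attack (per-domain) risk: cross-entropy on adversarially
    # perturbed inputs. Each `attack` is assumed to be a callable
    # attack(model, x, y) -> x_adv (e.g., PGD under a given L_p norm).
    risks = torch.stack([
        F.cross_entropy(model(attack(model, x, y)), y)
        for attack in attacks
    ])
    # REx objective: mean risk plus a variance penalty that pushes the
    # per-attack risks toward equality, promoting similar levels of
    # robustness against all training attacks.
    return risks.mean() + beta * risks.var()
```

Intuitively, the variance term penalizes specializing against any single attack, so gains against one attack type cannot come at the cost of much weaker robustness to another.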
