Paper Title
DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation
Paper Authors
Paper Abstract
Public resources and services (e.g., datasets, training platforms, pre-trained models) have been widely adopted to ease the development of Deep Learning (DL)-based applications. However, if the third-party providers are untrusted, they can inject poisoned samples into the datasets or embed backdoors in those models. Such an integrity breach can cause severe consequences, especially in safety- and security-critical applications. Various backdoor attack techniques have been proposed for higher effectiveness and stealthiness. Unfortunately, existing defense solutions cannot practically thwart those attacks in a comprehensive way. In this paper, we investigate the effectiveness of data augmentation techniques in mitigating backdoor attacks and enhancing DL models' robustness. We introduce an evaluation framework to achieve this goal. Specifically, we consider a unified defense solution, which (1) adopts a data augmentation policy to fine-tune the infected model and eliminate the effects of the embedded backdoor; and (2) uses another augmentation policy to preprocess input samples and invalidate the triggers during inference. We propose a systematic approach to discover the optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions. Extensive experiments show that our identified policy can effectively mitigate eight different kinds of backdoor attacks and outperform five existing defense methods. We envision that this framework can serve as a useful benchmark tool to advance future DNN backdoor studies.
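The two-stage defense described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not say which of the 71 augmentation functions make up the identified policies, so `horizontal_flip` and `random_shift` are stand-ins, and `fine_tune`, `defended_predict`, and `apply_policy` are hypothetical helper names rather than the framework's API. Images are plain nested lists so that the example runs with the standard library only.

```python
import random
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values
Augment = Callable[[Image, random.Random], Image]


def horizontal_flip(img: Image, rng: random.Random) -> Image:
    """Mirror every row (stand-in augmentation, not taken from the paper)."""
    return [row[::-1] for row in img]


def random_shift(img: Image, rng: random.Random, max_shift: int = 2) -> Image:
    """Shift right by a random number of pixels, zero-padding on the left."""
    if not img:
        return []
    width = len(img[0])
    k = min(rng.randint(0, max_shift), width)
    return [[0.0] * k + row[: width - k] for row in img]


def apply_policy(img: Image, policy: Sequence[Augment], rng: random.Random) -> Image:
    """Apply the augmentation functions of a policy in order."""
    for fn in policy:
        img = fn(img, rng)
    return img


def fine_tune(
    train_step: Callable[[Image, int], None],
    clean_data: Sequence[Tuple[Image, int]],
    policy: Sequence[Augment],
    epochs: int,
    rng: random.Random,
) -> int:
    """Stage 1: update the infected model on augmented clean samples.

    `train_step` is a hypothetical callback that performs one update;
    the function returns the number of updates performed.
    """
    steps = 0
    for _ in range(epochs):
        for img, label in clean_data:
            train_step(apply_policy(img, policy, rng), label)
            steps += 1
    return steps


def defended_predict(
    model: Callable[[Image], int],
    img: Image,
    policy: Sequence[Augment],
    rng: random.Random,
) -> int:
    """Stage 2: preprocess the input with a second policy before inference."""
    return model(apply_policy(img, policy, rng))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "infected" model: outputs the attacker's label 7 whenever the
    # top-left pixel (the trigger) is set, and 0 otherwise.
    infected = lambda img: 7 if img[0][0] == 1.0 else 0
    triggered = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(infected(triggered))  # 7
    print(defended_predict(infected, triggered, [horizontal_flip], rng))  # 0
```

In this toy case the flip moves the trigger pixel away from the position the model keys on, so the backdoor no longer fires. Real triggers and models are far less brittle, which is why the abstract pairs input preprocessing with fine-tuning and searches over many candidate functions.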