Paper Title
Adversarial Attack Vulnerability of Medical Image Analysis Systems: Unexplored Factors
Paper Authors
Paper Abstract
Adversarial attacks are considered a potentially serious security threat for machine learning systems. Medical image analysis (MedIA) systems have recently been argued to be vulnerable to adversarial attacks due to strong financial incentives and the associated technological infrastructure. In this paper, we study previously unexplored factors affecting the adversarial attack vulnerability of deep learning MedIA systems in three medical domains: ophthalmology, radiology, and pathology. We focus on adversarial black-box settings, in which the attacker does not have full access to the target model and usually uses another model, commonly referred to as a surrogate model, to craft adversarial examples. We consider this the most realistic scenario for MedIA systems. First, we study the effect of weight initialization (ImageNet vs. random) on the transferability of adversarial attacks from the surrogate model to the target model. Second, we study the influence of differences in development data between the target and surrogate models. We further study the interaction of weight initialization and data differences with differences in model architecture. All experiments were done with a perturbation degree tuned to ensure maximal transferability at minimal visual perceptibility of the attacks. Our experiments show that pre-training may dramatically increase the transferability of adversarial examples, even when the target and surrogate architectures are different: the larger the performance gain from pre-training, the larger the transferability. Differences in the development data between the target and surrogate models considerably decrease the performance of the attack; this decrease is further amplified by differences in model architecture. We believe these factors should be considered when developing security-critical MedIA systems intended for deployment in clinical practice.
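The black-box transfer setting described in the abstract can be illustrated with a minimal sketch: adversarial examples are crafted against a surrogate model the attacker controls, then evaluated on a separate target model. The sketch below uses the Fast Gradient Sign Method (FGSM) on two hypothetical linear classifiers with made-up weights; it is an illustration of the attack setting only, not the paper's models or data.

```python
import math

def score(w, b, x):
    """Linear decision score: positive -> class 1, negative -> class 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def fgsm(w, b, x, y, eps):
    """FGSM on a linear model with logistic loss.

    For a linear score s = w.x + b, the loss gradient w.r.t. the input is
    dL/dx_i = (sigmoid(s) - y) * w_i, so each feature is stepped by eps
    in the direction of the gradient's sign.
    """
    p = 1.0 / (1.0 + math.exp(-score(w, b, x)))
    return [xi + eps * math.copysign(1.0, (p - y) * wi)
            for wi, xi in zip(w, x)]

# Hypothetical surrogate and target models (weights are invented):
W_SURROGATE = [1.0, -0.8, 0.5]
W_TARGET = [0.9, -1.0, 0.6]
X = [0.6, -0.4, 0.3]  # toy input, true label 1

# Black-box setting: the perturbation is computed on the surrogate only.
x_adv = fgsm(W_SURROGATE, 0.0, X, y=1, eps=0.5)

print(score(W_TARGET, 0.0, X) > 0)      # clean input: target predicts class 1 -> True
print(score(W_TARGET, 0.0, x_adv) > 0)  # adversarial input: prediction flips -> False
```

The attack "transfers" here because the surrogate and target weights are similar, mirroring the paper's finding that shared factors (such as pre-training or development data) between surrogate and target increase transferability; the perturbation budget `eps` plays the role of the tuned perturbation degree.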