GANMEX: One-vs-One Attributions Guided by GAN-based Counterfactual Explanation Baselines

Paper title

GANMEX: One-vs-One Attributions Guided by GAN-based Counterfactual Explanation Baselines

Authors

Shih, Sheng-Min; Tien, Pin-Ju; Karnin, Zohar

Abstract

Attribution methods have shown promise for identifying the key features behind a learned model's predictions. Most existing attribution methods rely on a baseline input for performing feature perturbations, yet little research has addressed how that baseline should be selected. Poor baseline choices limit one-vs-one (1-vs-1) explanations for multi-class classifiers: the attribution method cannot explain why an input belongs to its original class rather than another specified target class. 1-vs-1 explanations are crucial when certain classes are more similar than others, e.g. two bird types among multiple animals, because they focus on key differentiating features rather than features shared across classes. In this paper, we present GAN-based Model EXplainability (GANMEX), a novel approach that applies Generative Adversarial Networks (GAN) by incorporating the to-be-explained classifier as part of the adversarial networks. Our approach effectively selects the counterfactual baseline as the closest realistic sample belonging to the target class, which allows attribution methods to provide true 1-vs-1 explanations. We show that GANMEX baselines improve the saliency maps and lead to stronger performance on perturbation-based evaluation metrics than existing baselines. Existing attribution results are known to be insensitive to model randomization, and we demonstrate that GANMEX baselines lead to better outcomes under cascading randomization of the model.
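To make the role of the baseline concrete, the following is a minimal sketch, not the paper's implementation. It assumes Integrated Gradients as the baseline-dependent attribution method (the abstract does not name one), a hypothetical three-class linear softmax classifier `W`, and a hand-picked counterfactual input standing in for what a GAN would generate. Feature 0 is shared by the two similar classes (0 and 1), while features 1 and 2 distinguish them.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def class_prob(W, x, c):
    """Probability of class c under a linear softmax model (toy assumption)."""
    return softmax(W @ x)[c]

def class_prob_grad(W, x, c):
    """Analytic gradient of class_prob with respect to the input x."""
    p = softmax(W @ x)
    return p[c] * (W[c] - p @ W)

def integrated_gradients(W, x, baseline, c, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([
        class_prob_grad(W, baseline + a * (x - baseline), c) for a in alphas
    ])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights: classes 0 and 1 share feature 0; class 2 opposes it.
W = np.array([[ 2.0, 2.0, 0.0],
              [ 2.0, 0.0, 2.0],
              [-2.0, 0.0, 0.0]])

x = np.array([1.0, 1.0, 0.0])               # input belonging to class 0
zero_baseline = np.zeros(3)                 # common uninformative baseline
counterfactual = np.array([1.0, 0.0, 1.0])  # hand-picked class-1 stand-in

attr_zero = integrated_gradients(W, x, zero_baseline, c=0)
attr_cf = integrated_gradients(W, x, counterfactual, c=0)

print("zero baseline:          ", attr_zero)
print("counterfactual baseline:", attr_cf)
```

In this toy setup the zero baseline credits the shared feature 0, whereas the target-class baseline assigns it no attribution and puts all the weight on the two differentiating features, which is the 1-vs-1 behavior the abstract describes.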
