Paper Title

Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection

Authors

Hanjie Chen, Guangtao Zheng, Yangfeng Ji

Abstract

Generating explanations for neural networks has become crucial for their applications in real-world with respect to reliability and trustworthiness. In natural language processing, existing methods usually provide important features which are words or phrases selected from an input text as an explanation, but ignore the interactions between them. It poses challenges for humans to interpret an explanation and connect it to model prediction. In this work, we build hierarchical explanations by detecting feature interactions. Such explanations visualize how words and phrases are combined at different levels of the hierarchy, which can help users understand the decision-making of black-box models. The proposed method is evaluated with three neural text classifiers (LSTM, CNN, and BERT) on two benchmark datasets, via both automatic and human evaluations. Experiments show the effectiveness of the proposed method in providing explanations that are both faithful to models and interpretable to humans.
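To make the idea of building a hierarchy from feature interactions concrete, here is a minimal illustrative sketch, not the paper's actual algorithm: it uses a toy, occlusion-based interaction score between adjacent spans and greedily merges the most strongly interacting neighbors to form phrase groupings level by level. The stand-in classifier `toy_predict`, the `[MASK]` occlusion scheme, and the greedy agglomerative strategy are all assumptions made for this demo.

```python
# Illustrative sketch only: occlusion-based interaction scores between adjacent
# spans, used to build a phrase hierarchy bottom-up. The predictor, masking
# scheme, and merging strategy are assumptions for the demo, not the paper's method.

from typing import Callable, List, Tuple


def masked_score(predict_fn: Callable[[List[str]], float],
                 tokens: List[str], keep: set, mask_token: str = "[MASK]") -> float:
    """Model score for the target class when only positions in `keep` are visible."""
    masked = [t if i in keep else mask_token for i, t in enumerate(tokens)]
    return predict_fn(masked)


def interaction(predict_fn, tokens,
                span_a: Tuple[int, int], span_b: Tuple[int, int]) -> float:
    """Occlusion-style interaction between two spans: f(A∪B) - f(A) - f(B) + f(∅)."""
    a, b = set(range(*span_a)), set(range(*span_b))
    return (masked_score(predict_fn, tokens, a | b)
            - masked_score(predict_fn, tokens, a)
            - masked_score(predict_fn, tokens, b)
            + masked_score(predict_fn, tokens, set()))


def build_hierarchy(predict_fn, tokens: List[str]):
    """Greedily merge the adjacent spans with the strongest interaction,
    recording the span partition at every level of the hierarchy."""
    spans = [(i, i + 1) for i in range(len(tokens))]
    levels = [list(spans)]
    while len(spans) > 1:
        scores = [abs(interaction(predict_fn, tokens, spans[i], spans[i + 1]))
                  for i in range(len(spans) - 1)]
        j = max(range(len(scores)), key=scores.__getitem__)
        spans[j:j + 2] = [(spans[j][0], spans[j + 1][1])]  # merge the pair
        levels.append(list(spans))
    return levels


if __name__ == "__main__":
    tokens = "not a very good movie".split()

    def toy_predict(ts: List[str]) -> float:
        # Hypothetical stand-in for a black-box classifier's positive-class score:
        # "good" raises the score, and "not" interacts with "good" to lower it.
        score = 0.5 + 0.2 * ("good" in ts)
        if "not" in ts and "good" in ts:
            score -= 0.5
        return score

    for level in build_hierarchy(toy_predict, tokens):
        print([" ".join(tokens[s:e]) for s, e in level])
```

Running the sketch prints one partition per hierarchy level, showing which words get grouped into phrases first; with the toy predictor, the negation "not" is pulled together with "good" because their occlusion-based interaction is the strongest.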
