Paper Title
InSRL: A Multi-view Learning Framework Fusing Multiple Information Sources for Distantly-supervised Relation Extraction
Paper Authors
Paper Abstract
Distant supervision makes it possible to automatically label bags of sentences for relation extraction by leveraging knowledge bases, but it suffers from sparse and noisy bags. Additional information sources are needed to supplement the training data and overcome these issues. In this paper, we introduce two sources widely available in knowledge bases, namely entity descriptions and multi-grained entity types, to enrich the distantly supervised data. We treat these information sources as multiple views and fuse them to construct an intact space with sufficient information. We propose an end-to-end multi-view learning framework for relation extraction via Intact Space Representation Learning (InSRL), in which the representations of the individual views are learned jointly. Moreover, inner-view and cross-view attention mechanisms highlight important information at different levels on an entity-pair basis. Experimental results on a popular benchmark dataset demonstrate the necessity of the additional information sources and the effectiveness of our framework. We will release the implementation of our model and the dataset with multiple information sources after the anonymous review phase.
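To make the multi-view fusion idea concrete, below is a minimal, hypothetical sketch of cross-view attentive fusion in the spirit of the abstract: per-view representations (sentence bag, entity descriptions, multi-grained entity types) are projected into a shared "intact" space and combined with attention weights conditioned on the entity pair. All module names, dimensions, and the scoring function are assumptions for illustration, not the paper's actual InSRL architecture.

```python
# Illustrative sketch only; the real InSRL architecture is not specified here.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossViewFusion(nn.Module):
    def __init__(self, view_dim: int, num_views: int, num_relations: int):
        super().__init__()
        # One learnable projection per view into the shared (intact) space.
        self.projections = nn.ModuleList(
            nn.Linear(view_dim, view_dim) for _ in range(num_views)
        )
        self.classifier = nn.Linear(view_dim, num_relations)

    def forward(self, views, pair_query):
        # views: list of (batch, view_dim) tensors, one per information source.
        # pair_query: (batch, view_dim) embedding of the entity pair.
        projected = torch.stack(
            [proj(v) for proj, v in zip(self.projections, views)], dim=1
        )  # (batch, num_views, view_dim)
        # Scaled dot-product scores conditioned on the entity pair.
        scores = (projected * pair_query.unsqueeze(1)).sum(-1)
        weights = F.softmax(scores / math.sqrt(projected.size(-1)), dim=-1)
        # Attention-weighted fusion into a single intact representation.
        intact = (weights.unsqueeze(-1) * projected).sum(dim=1)
        return self.classifier(intact)

# Toy usage with random inputs: three views, 128-dim, 5 relation classes.
model = CrossViewFusion(view_dim=128, num_views=3, num_relations=5)
views = [torch.randn(4, 128) for _ in range(3)]
pair_query = torch.randn(4, 128)
logits = model(views, pair_query)  # (4, 5)
```

The sketch collapses each view to a single vector for brevity; in the paper's setting, inner-view attention would first aggregate within each source (e.g., over the sentences of a bag) before the cross-view step shown above.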