At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization

Paper Title

At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization

Authors

Zhou, Qingyu; Wei, Furu; Zhou, Ming

Abstract

Extractive methods have proven effective in automatic document summarization. Previous work performs this task by identifying informative content at the sentence level. However, it is unclear whether extraction at the sentence level is the best solution. In this work, we show that unnecessity and redundancy issues exist when extracting full sentences, and that extracting sub-sentential units is a promising alternative. Specifically, we propose extracting sub-sentential units based on the constituency parse tree. We present a neural extractive model that leverages sub-sentential information and extracts these units. Extensive experiments and analyses show that extracting sub-sentential units performs competitively compared to full-sentence extraction under both automatic and human evaluation. We hope our work provides some inspiration regarding the basic extraction units in extractive summarization for future research.
