Paper Title
A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards
Paper Authors
Paper Abstract
Cross-lingual text summarization aims at generating a document summary in one language given input in another language. It is a practically important but under-explored task, primarily due to the dearth of available data. Existing methods resort to machine translation to synthesize training data, but such pipeline approaches suffer from error propagation. In this work, we propose an end-to-end cross-lingual text summarization model. The model uses reinforcement learning to directly optimize a bilingual semantic similarity metric between the summaries generated in a target language and gold summaries in a source language. We also introduce techniques to pre-train the model leveraging monolingual summarization and machine translation objectives. Experimental results in both English--Chinese and English--German cross-lingual summarization settings demonstrate the effectiveness of our methods. In addition, we find that reinforcement learning models with bilingual semantic similarity as rewards generate more fluent sentences than strong baselines.
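The abstract only sketches the training signal, so the following is a minimal, hypothetical illustration of how a bilingual semantic similarity reward could be plugged into a self-critical REINFORCE-style objective. It is not the authors' implementation: `gen_emb`, `gold_emb`, and `greedy_emb` stand in for outputs of some shared multilingual sentence encoder, and the summarizer itself is reduced to precomputed token log-probabilities of a sampled summary.

```python
import torch
import torch.nn.functional as F


def bilingual_similarity_reward(gen_summary_emb: torch.Tensor,
                                gold_summary_emb: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between the embedding of a generated target-language
    summary and the gold source-language summary (both assumed to come from a
    shared multilingual encoder). Used as the sequence-level reward."""
    return F.cosine_similarity(gen_summary_emb, gold_summary_emb, dim=-1)


def self_critical_loss(sample_log_probs: torch.Tensor,  # (batch, seq_len) log-probs of sampled tokens
                       sample_reward: torch.Tensor,     # (batch,) reward of the sampled summary
                       baseline_reward: torch.Tensor    # (batch,) reward of the greedy summary (baseline)
                       ) -> torch.Tensor:
    """Policy-gradient loss: raise the likelihood of sampled summaries that
    beat the greedy baseline, lower it otherwise."""
    advantage = (sample_reward - baseline_reward).detach()
    return -(advantage.unsqueeze(-1) * sample_log_probs).sum(dim=-1).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    batch, seq_len, dim, vocab = 2, 16, 512, 8
    # Random stand-ins for encoder outputs and decoder log-probs of one batch.
    gen_emb = torch.randn(batch, dim)     # sampled target-language summary embedding
    gold_emb = torch.randn(batch, dim)    # gold source-language summary embedding
    greedy_emb = torch.randn(batch, dim)  # greedy-decoded summary embedding (baseline)
    log_probs = torch.log_softmax(torch.randn(batch, seq_len, vocab), dim=-1).max(dim=-1).values

    r_sample = bilingual_similarity_reward(gen_emb, gold_emb)
    r_greedy = bilingual_similarity_reward(greedy_emb, gold_emb)
    loss = self_critical_loss(log_probs, r_sample, r_greedy)
    print(f"sample rewards: {r_sample.tolist()}, loss: {loss.item():.4f}")
```

The greedy-decoding baseline here is only one common variance-reduction choice for sequence-level rewards; the key point the abstract makes is that the reward compares the generated target-language summary directly against the source-language gold summary, so no synthesized target-language references are required.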