Paper Title

Modeling Discourse Structure for Document-level Neural Machine Translation

Paper Authors

Junxuan Chen, Xiang Li, Jiarui Zhang, Chulun Zhou, Jianwei Cui, Bin Wang, Jinsong Su

Paper Abstract

Recently, document-level neural machine translation (NMT) has become a hot topic in the machine translation community. Despite this success, most existing studies ignore the discourse structure information of the input document to be translated, which has proven effective in other tasks. In this paper, we propose to improve document-level NMT with the aid of discourse structure information. Our encoder is based on a hierarchical attention network (HAN). Specifically, we first parse the input document to obtain its discourse structure. Then, we introduce a Transformer-based path encoder to embed the discourse structure information of each word. Finally, we combine the discourse structure information with the word embedding before it is fed into the encoder. Experimental results on an English-to-German dataset show that our model significantly outperforms both Transformer and Transformer+HAN.
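The pipeline described in the abstract (parse the document, encode each word's discourse-tree path, then add that vector to the word embedding before the encoder) can be sketched as follows. This is a minimal illustration, not the authors' implementation: all names are hypothetical, and a simple mean-pooling function stands in for the Transformer-based path encoder.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # embedding dimension (illustrative)

def encode_path(path_node_embeddings):
    """Stand-in for the Transformer-based path encoder: collapses the
    embeddings of the discourse-tree nodes on a word's root-to-leaf
    path into a single vector (here, simply by mean pooling)."""
    return np.mean(path_node_embeddings, axis=0)

# Hypothetical parsed document: 5 words, each with a path of node
# embeddings from the document root down to its discourse unit.
word_embeddings = rng.normal(size=(5, D))
paths = [rng.normal(size=(depth, D)) for depth in (2, 2, 3, 3, 3)]

# Combine the discourse-structure vector with each word embedding
# before it is fed into the (HAN-based) encoder, as the abstract says.
encoder_inputs = np.stack(
    [w + encode_path(p) for w, p in zip(word_embeddings, paths)]
)
print(encoder_inputs.shape)  # (5, 8)
```

The key design point the abstract makes is that structure is injected at the input layer: the encoder itself is unchanged, and only its input embeddings are enriched with discourse information.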
