Paper title
AdvAug: Robust Adversarial Augmentation for Neural Machine Translation
Paper authors
Paper abstract
In this paper, we propose a new adversarial augmentation method for Neural Machine Translation (NMT). The main idea is to minimize the vicinal risk over virtual sentences sampled from two vicinity distributions, of which the crucial one is a novel vicinity distribution for adversarial sentences that describes a smooth interpolated embedding space centered around observed training sentence pairs. We then discuss our approach, AdvAug, to train NMT models using the embeddings of virtual sentences in sequence-to-sequence learning. Experiments on Chinese-English, English-French, and English-German translation benchmarks show that AdvAug achieves significant improvements over the Transformer (up to 4.9 BLEU points), and substantially outperforms other data augmentation techniques (e.g. back-translation) without using extra corpora.
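The "smooth interpolated embedding space" mentioned in the abstract can be pictured with a small sketch. The code below is only an illustration under stated assumptions, not the authors' implementation: it assumes a mixup-style convex combination of two equal-length embedding sequences, with the mixing coefficient drawn from a Beta(alpha, alpha) distribution. The function names, the value of `alpha`, and the toy vectors are all hypothetical; the abstract does not specify them.

```python
import random


def interpolate_embeddings(emb_a, emb_b, lam):
    """Convex combination of two embedding sequences, token by token.

    emb_a and emb_b are lists of token vectors with identical shapes
    (in practice, sentences would be padded to a common length).
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("sequences must have the same length")
    return [
        [lam * x + (1.0 - lam) * y for x, y in zip(tok_a, tok_b)]
        for tok_a, tok_b in zip(emb_a, emb_b)
    ]


def sample_virtual_sentence(emb_a, emb_b, alpha=0.4, rng=random):
    """Sample one virtual sentence embedding lying between two sentences.

    Assumption: lam ~ Beta(alpha, alpha), as in mixup. The abstract does
    not state the sampling distribution, so this is a placeholder choice.
    """
    lam = rng.betavariate(alpha, alpha)
    return lam, interpolate_embeddings(emb_a, emb_b, lam)


# Toy example: two "adversarial sentences" of 2 tokens, 3-dimensional embeddings.
adv_1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
adv_2 = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]

lam, virtual = sample_virtual_sentence(adv_1, adv_2, alpha=0.4, rng=random.Random(0))
```

Every sampled `virtual` sequence lies on the line segment between the two inputs in embedding space, which is one way to read "sampling virtual sentences from a vicinity distribution". A training loop in this spirit would feed such interpolated embeddings, rather than the embeddings of observed sentences, to the sequence-to-sequence model.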