Paper title
DICTDIS: Dictionary Constrained Disambiguation for Improved NMT
Paper authors
Paper abstract
Domain-specific neural machine translation (NMT) systems (e.g., in educational applications) are socially significant, with the potential to help make information accessible to a diverse set of users in multilingual societies. It is desirable that such NMT systems be lexically constrained and draw from domain-specific dictionaries. Because words are polysemous, a dictionary may present multiple candidate translations for a source word or phrase. The onus is then on the NMT model to choose the contextually most appropriate candidate. Prior work has largely ignored this problem and focused on the single-candidate constraint setting, wherein the target word or phrase is replaced by a single constraint. In this work, we present DictDis, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries. We achieve this by augmenting the training data with multiple dictionary candidates, which actively encourages disambiguation during training by implicitly aligning multiple candidate constraints. We demonstrate the utility of DictDis via extensive experiments on English-Hindi and English-German sentences in a variety of domains, including regulatory, finance, and engineering. We also present comparisons on standard benchmark test datasets. In comparison with existing approaches for lexically constrained and unconstrained NMT, we demonstrate superior performance with respect to constraint copy and disambiguation-related measures on all domains, while also obtaining fluency improvements of up to 2-3 BLEU points on some domains.
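The augmentation idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how candidates are attached to the source sentence, so everything below is an assumption made for illustration: the function name `augment_source`, the marker tokens `<c>`, `<s>`, and `</c>`, the inline placement of candidates after the matched source word, and the restriction to single-word dictionary entries. It is not the authors' implementation.

```python
from typing import Dict, List


def augment_source(
    tokens: List[str],
    dictionary: Dict[str, List[str]],
    cand_open: str = "<c>",    # hypothetical marker: start of candidate list
    cand_sep: str = "<s>",     # hypothetical marker: separates candidates
    cand_close: str = "</c>",  # hypothetical marker: end of candidate list
) -> List[str]:
    """Append all dictionary candidates after each matching source token.

    Assumption: the model sees every candidate during training and learns
    to pick the one that fits the context. Only single-word source entries
    are matched here; phrase matching is left out for brevity.
    """
    augmented: List[str] = []
    for token in tokens:
        augmented.append(token)
        candidates = dictionary.get(token.lower())
        if not candidates:
            continue
        augmented.append(cand_open)
        for index, candidate in enumerate(candidates):
            if index > 0:
                augmented.append(cand_sep)
            # A candidate translation may itself span several tokens.
            augmented.extend(candidate.split())
        augmented.append(cand_close)
    return augmented


if __name__ == "__main__":
    # Toy English-German dictionary with a polysemous entry.
    toy_dictionary = {"bank": ["Bank", "Ufer"]}
    source = "She sat on the river bank".split()
    print(" ".join(augment_source(source, toy_dictionary)))
```

In this toy case, both German candidates for "bank" are placed next to the source word, leaving the choice between them to the translation model rather than forcing a single constraint.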