Paper Title
KPT: Keyword-guided Pre-training for Grounded Dialog Generation

Paper Authors

Qi Zhu, Fei Mi, Zheng Zhang, Yasheng Wang, Yitong Li, Xin Jiang, Qun Liu, Xiaoyan Zhu, Minlie Huang

Paper Abstract

Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well w.r.t. different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation without relying on extra knowledge annotation. Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords. With these keywords, we construct two kinds of knowledge and pre-train a knowledge-grounded response generation model, aiming at handling two different scenarios: (1) the knowledge should be faithfully grounded; (2) it can be selectively used. For the former, the grounding knowledge consists of keywords extracted from the response. For the latter, the grounding knowledge is additionally augmented with keywords extracted from other utterances in the same dialog. Since the knowledge is extracted from the dialog itself, KPT can be easily performed on a large volume and variety of dialog data. We consider three data sources (open-domain, task-oriented, conversational QA) with a total of 2.5M dialogs. We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge.
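
To make the keyword-extraction step concrete, here is a minimal sketch of selecting "most uncertain" tokens via per-token negative log-likelihood under a causal language model. The abstract does not specify which pre-trained LM or selection rule KPT uses; GPT-2 as the scoring model, NLL as the uncertainty measure, and the top-k cutoff are all assumptions for illustration.

```python
# A minimal sketch of uncertainty-based keyword extraction.
# Assumptions (not specified in the paper's abstract): GPT-2 as the scoring
# model, per-token NLL as the uncertainty score, top-k selection.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def extract_keywords(utterance: str, top_k: int = 3) -> list[str]:
    """Return the top_k tokens with the highest NLL under the LM,
    i.e., the tokens the model is most uncertain about."""
    enc = tokenizer(utterance, return_tensors="pt")
    input_ids = enc.input_ids  # shape: (1, seq_len)
    with torch.no_grad():
        logits = model(input_ids).logits  # (1, seq_len, vocab)
    # Under a causal LM, token i is predicted from position i-1,
    # so the first token has no score and targets are shifted by one.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = input_ids[0, 1:]
    nll = -log_probs[torch.arange(targets.size(0)), targets]
    top = torch.topk(nll, min(top_k, nll.size(0))).indices
    return [tokenizer.decode([int(targets[i])]).strip() for i in top]

print(extract_keywords("I would love to visit the Louvre in Paris someday."))
```

High-NLL tokens tend to be content words (entities, rare nouns) rather than function words, which is what makes them plausible stand-ins for grounding knowledge.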
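Building on that, the following sketch assembles the two grounding-knowledge variants described in the abstract: response-only keywords for the faithful-grounding scenario, and keywords augmented from other utterances for the selective-use scenario. The helper name `build_pretraining_example`, the number of distractor keywords sampled, and the output fields are hypothetical; only the two-variant scheme itself comes from the abstract.

```python
# A minimal sketch of constructing the two kinds of grounding knowledge,
# reusing extract_keywords from the sketch above. The sampling of two
# distractor keywords is an illustrative assumption, not the paper's recipe.
import random

def build_pretraining_example(dialog: list[str], turn: int) -> dict:
    """dialog: list of utterances; turn: index of the target response."""
    response = dialog[turn]
    context = dialog[:turn]
    # Scenario 1 (faithful grounding): keywords from the response only.
    faithful = extract_keywords(response)
    # Scenario 2 (selective use): augment with keywords from other turns,
    # so the model must learn to ignore knowledge the response doesn't use.
    others = [kw for i, utt in enumerate(dialog) if i != turn
              for kw in extract_keywords(utt)]
    selective = faithful + random.sample(others, min(2, len(others)))
    return {"context": context,
            "knowledge_faithful": faithful,
            "knowledge_selective": selective,
            "response": response}
```

Because both variants are derived from the dialog itself, this construction needs no external annotation, which is what lets KPT scale to millions of dialogs.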
