Paper Title
PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation
Paper Authors
Paper Abstract
Self-supervised pre-training, such as BERT, MASS, and BART, has emerged as a powerful technique for natural language understanding and generation. Existing pre-training techniques employ autoencoding and/or autoregressive objectives to train Transformer-based models by recovering the original word tokens from corrupted text in which some tokens are masked. The training objectives of existing techniques are often inconsistent with those of many language generation tasks, such as generative question answering and conversational response generation, which require producing new text given a context. This work presents PALM, a novel scheme that jointly pre-trains an autoencoding and autoregressive language model on a large unlabeled corpus, specifically designed for generating new text conditioned on context. The new scheme alleviates the mismatch between pre-training and fine-tuning introduced by existing denoising schemes, where generation involves more than reconstructing the original text. An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks covering generative question answering (Rank 1 on the official MARCO leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword, question generation on SQuAD, and conversational response generation on Cornell Movie Dialogues.
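To make the joint objective described above concrete, here is a minimal sketch, assuming a standard PyTorch encoder-decoder Transformer: the encoder carries a masked-token reconstruction (autoencoding) loss over the context, and the decoder carries an autoregressive next-token loss over the continuation, with the two losses summed. All names (JointPretrainModel, joint_loss, etc.) are hypothetical illustrations, not PALM's released code, and details such as the context/target split and masking ratio are omitted.

```python
# A minimal sketch of joint autoencoding + autoregressive pre-training.
# Assumes PyTorch >= 1.9 (for batch_first=True on nn.Transformer).
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointPretrainModel(nn.Module):
    """Encoder-decoder Transformer with an MLM head on the encoder (autoencoding)
    and an LM head on the decoder (autoregressive generation)."""
    def __init__(self, vocab_size, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(d_model=d_model, batch_first=True)
        self.mlm_head = nn.Linear(d_model, vocab_size)  # reconstructs masked context tokens
        self.lm_head = nn.Linear(d_model, vocab_size)   # predicts the next target token

    def forward(self, context_ids, target_in_ids):
        # Encoder reads the (partially masked) context.
        memory = self.transformer.encoder(self.embed(context_ids))
        # Decoder generates the continuation autoregressively, attending to the encoder output.
        causal = nn.Transformer.generate_square_subsequent_mask(
            target_in_ids.size(1)).to(target_in_ids.device)
        dec_out = self.transformer.decoder(self.embed(target_in_ids), memory, tgt_mask=causal)
        return self.mlm_head(memory), self.lm_head(dec_out)

def joint_loss(model, masked_context, mlm_labels, target_in, target_out, ignore_index=-100):
    """Sum of the autoencoding (masked-token reconstruction) and autoregressive
    (next-token prediction) objectives; positions labeled ignore_index are skipped."""
    mlm_logits, gen_logits = model(masked_context, target_in)
    loss_ae = F.cross_entropy(mlm_logits.transpose(1, 2), mlm_labels, ignore_index=ignore_index)
    loss_ar = F.cross_entropy(gen_logits.transpose(1, 2), target_out, ignore_index=ignore_index)
    return loss_ae + loss_ar

# Toy usage (batch of 2, context length 8, target length 4); the substitution of
# masked positions with a [MASK] token id is omitted for brevity.
model = JointPretrainModel(vocab_size=1000)
ctx = torch.randint(0, 1000, (2, 8))
mlm_labels = torch.full((2, 8), -100)   # -100 marks unsupervised positions
mlm_labels[:, 3] = ctx[:, 3]            # supervise only the "masked" slot
tgt_in = torch.randint(0, 1000, (2, 4))     # decoder input (shifted right)
tgt_out = torch.randint(0, 1000, (2, 4))    # decoder targets
loss = joint_loss(model, ctx, mlm_labels, tgt_in, tgt_out)
loss.backward()
```

The design point this sketch illustrates is that the decoder learns to produce new text conditioned on the encoded context rather than merely reconstructing the corrupted input, which is the mismatch with pure denoising pre-training that the abstract highlights.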