Solving Math Word Problems via Cooperative Reasoning induced Language Models

Paper Title

Solving Math Word Problems via Cooperative Reasoning induced Language Models

Authors

Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Ruyi Gan, Jiaxing Zhang, Yujiu Yang

Abstract

Large-scale pre-trained language models (PLMs) bring new opportunities to challenging problems, especially those that need high-level intelligence, such as math word problems (MWPs). However, directly applying existing PLMs to MWPs can fail because the generation process lacks sufficient supervision and therefore lacks the fast adaptivity of humans. We notice that human reasoning has a dual reasoning framework that consists of an immediate reaction system (system 1) and a delicate reasoning system (system 2), where the entire reasoning is determined by their interaction. This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), resulting in a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier. In our approach, the generator is responsible for generating reasoning paths, and the verifiers are used to supervise the evaluation in order to obtain reliable feedback for the generator. We evaluate our CoRe framework on several mathematical reasoning datasets and achieve a decent improvement over state-of-the-art methods, with up to a 9.6% increase over the best baselines. Our code is available at https://github.com/TianHongZXY/CoRe
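
The generator and verifier roles described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how paths are sampled or scored, so `select_answer`, `toy_generate`, and `toy_verify` are hypothetical names, the generator is a fixed list standing in for sampling from a language model, and the verifier is a simple arithmetic check standing in for a learned scorer.

```python
from typing import Callable, List, Tuple

# A reasoning path: (list of step strings, final answer).
Path = Tuple[List[str], str]


def select_answer(
    question: str,
    generate: Callable[[str, int], Path],
    verify: Callable[[str, Path], float],
    num_paths: int = 4,
) -> Tuple[Path, float]:
    """Sample candidate paths, score each with the verifier, keep the best."""
    candidates = [generate(question, seed) for seed in range(num_paths)]
    scored = [(verify(question, path), path) for path in candidates]
    best_score, best_path = max(scored, key=lambda pair: pair[0])
    return best_path, best_score


def toy_generate(question: str, seed: int) -> Path:
    # Assumption: a fixed list stands in for sampling from a language model.
    samples: List[Path] = [
        (["3 + 2 = 6"], "6"),
        (["3 + 2 = 5"], "5"),
        (["3 - 2 = 2"], "2"),
    ]
    return samples[seed % len(samples)]


def toy_verify(question: str, path: Path) -> float:
    # Assumption: score = fraction of "a op b = c" steps that are
    # arithmetically correct; a real verifier would be a trained model.
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b}
    steps, _answer = path
    correct = 0
    for step in steps:
        lhs, rhs = step.split("=")
        a, op, b = lhs.split()
        if ops[op](int(a), int(b)) == int(rhs):
            correct += 1
    return correct / len(steps)


question = "Tom has 3 apples and buys 2 more. How many apples does he have?"
best_path, best_score = select_answer(question, toy_generate, toy_verify, num_paths=3)
```

Here the verifier's score is used only to pick among sampled paths; the abstract additionally describes verifier feedback flowing back to the generator, which this sketch does not model.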

Paper Title