Paper Title
Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition
Paper Authors
Abstract
Natural language inference (NLI) is an increasingly important task for natural language understanding, which requires one to infer whether a sentence entails another. However, the ability of NLI models to make pragmatic inferences remains understudied. We create an IMPlicature and PRESupposition diagnostic dataset (IMPPRES), consisting of >25k semiautomatically generated sentence pairs illustrating well-studied pragmatic inference types. We use IMPPRES to evaluate whether BERT, InferSent, and BOW NLI models trained on MultiNLI (Williams et al., 2018) learn to make pragmatic inferences. Although MultiNLI appears to contain very few pairs illustrating these inference types, we find that BERT learns to draw pragmatic inferences. It reliably treats scalar implicatures triggered by "some" as entailments. For some presupposition triggers like "only", BERT reliably recognizes the presupposition as an entailment, even when the trigger is embedded under an entailment canceling operator like negation. BOW and InferSent show weaker evidence of pragmatic reasoning. We conclude that NLI training encourages models to learn some, but not all, pragmatic inferences.
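To make the evaluation setup concrete, the sketch below shows how pragmatic inferences of the two kinds the abstract mentions can be cast as NLI premise/hypothesis pairs. The sentences, field names, and labels here are purely illustrative assumptions, not actual items from IMPPRES.

```python
# Illustrative sketch: pragmatic inferences framed as NLI pairs.
# These examples are hypothetical, not drawn from the IMPPRES dataset.
examples = [
    # Scalar implicature: "some" pragmatically implicates "not all",
    # so a pragmatic listener treats the pair as an entailment.
    {"premise": "Jo ate some of the cookies.",
     "hypothesis": "Jo did not eat all of the cookies.",
     "inference_type": "scalar_implicature",
     "pragmatic_label": "entailment"},
    # Presupposition: "only Bill left" presupposes that Bill left, and
    # the presupposition survives embedding under negation.
    {"premise": "It is not the case that only Bill left.",
     "hypothesis": "Bill left.",
     "inference_type": "presupposition",
     "pragmatic_label": "entailment"},
]

for ex in examples:
    print(f'{ex["inference_type"]}: "{ex["premise"]}" -> '
          f'"{ex["hypothesis"]}" ({ex["pragmatic_label"]})')
```

An NLI model trained on MultiNLI is then queried on such pairs; a model that has learned the pragmatic inference will label them "entailment" rather than "neutral".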