Paper Title

What Meaning-Form Correlation Has to Compose With

Authors

Timothee Mickus, Timothée Bernard, Denis Paperno

Abstract

Compositionality is a widely discussed property of natural languages, although its exact definition has been elusive. We focus on the proposal that compositionality can be assessed by measuring meaning-form correlation (MFC). We analyze meaning-form correlation on three sets of languages: (i) artificial toy languages tailored to be compositional, (ii) a set of English dictionary definitions, and (iii) a set of English sentences drawn from literature. We find that linguistic phenomena such as synonymy and ungrounded stop-words weigh on MFC measurements, and that straightforward methods to mitigate their effects have widely varying results depending on the dataset they are applied to. Data and code are made publicly available.
