Lexical semantics enhanced neural word embeddings

Paper title

Lexical semantics enhanced neural word embeddings

Authors

Yang, Dongqiang; Li, Ning; Zou, Li; Ma, Hongwei

Abstract

Current breakthroughs in natural language processing have benefited dramatically from neural language models, through which distributional semantics can leverage neural data representations to facilitate downstream applications. Since neural embeddings use context prediction on word co-occurrences to yield dense vectors, they are inevitably prone to capture more semantic association than semantic similarity. To improve vector space models in deriving semantic similarity, we post-process neural word embeddings through deep metric learning, through which we can inject lexical-semantic relations, including syn/antonymy and hypo/hypernymy, into a distributional space. We introduce hierarchy-fitting, a novel semantic specialization approach to modelling semantic similarity nuances inherently stored in the IS-A hierarchies. Hierarchy-fitting attains state-of-the-art results on the common- and rare-word benchmark datasets for deriving semantic similarity from neural word embeddings. It also incorporates an asymmetric distance function to specialize hypernymy's directionality explicitly, through which it significantly improves vanilla embeddings in multiple evaluation tasks of detecting hypernymy and directionality without negative impacts on semantic similarity judgement. The results demonstrate the efficacy of hierarchy-fitting in specializing neural embeddings with semantic relations in late fusion, potentially expanding its applicability to aggregating heterogeneous data and various knowledge resources for learning multimodal semantic spaces.

Paper title