Weak Disambiguation for Partial Structured Output Learning

Paper title

Weak Disambiguation for Partial Structured Output Learning

Authors

Lu, Xiaolei; Chow, Tommy W. S.

Abstract

Existing disambiguation strategies for partial structured output learning do not generalize well when some candidates are false positives or are similar to the ground-truth label. In this paper, we propose a novel weak disambiguation method for partial structured output learning (WD-PSL). First, a piecewise large margin formulation is generalized to partial structured output learning, which effectively avoids handling a large number of candidate structured outputs for complex structures. Second, in the proposed weak disambiguation strategy, each candidate label is assigned a confidence value indicating how likely it is to be the true label, which aims to reduce the negative effects of wrong ground-truth label assignment in the learning process. Two large margins are then formulated to combine two types of constraints: the disambiguation between candidates and non-candidates, and the weak disambiguation for candidates. Within an alternating optimization framework, a new 2n-slack variables cutting plane algorithm is developed to accelerate each iteration of optimization. Experimental results on several sequence labeling tasks in natural language processing show the effectiveness of the proposed model.
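
To make the two kinds of constraints in the abstract concrete, the following is a minimal per-token sketch, not the formulation used in the paper. It assumes a single position with a score per label, a hypothetical candidate set, a softmax over candidate scores as the confidence update (an assumption standing in for the paper's alternating step), and a plain hinge loss for the margin between candidates and non-candidates. The function names, label names, and scores are all illustrative.

```python
import math


def update_confidences(scores, candidates):
    """Assign each candidate a confidence via a softmax over candidate scores.

    Assumption: softmax is used here only for illustration; non-candidate
    labels receive no confidence at all.
    """
    top = max(scores[c] for c in candidates)  # subtract max for stability
    exps = {c: math.exp(scores[c] - top) for c in candidates}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def candidate_margin_loss(scores, candidates, margin=1.0):
    """Hinge loss separating the best candidate from the best non-candidate."""
    best_candidate = max(scores[c] for c in candidates)
    others = [s for label, s in scores.items() if label not in candidates]
    if not others:  # every label is a candidate, nothing to separate
        return 0.0
    return max(0.0, margin - (best_candidate - max(others)))


# Hypothetical scores for one token in a sequence labeling task.
scores = {"B-PER": 2.0, "I-PER": 1.5, "O": 1.8, "B-LOC": -0.5}
candidates = {"B-PER", "I-PER"}

confidences = update_confidences(scores, candidates)
loss = candidate_margin_loss(scores, candidates)
```

In this sketch no single candidate is committed to as the ground truth: both candidates keep a nonzero confidence, while the hinge term only asks that some candidate outscore every non-candidate by the margin.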
