Paper Title

H-FND: Hierarchical False-Negative Denoising for Distant Supervision Relation Extraction

Authors

Jhih-Wei Chen, Tsu-Jui Fu, Chen-Kang Lee, Wei-Yun Ma

Abstract

Although distant supervision automatically generates training data for relation extraction, it also introduces false-positive (FP) and false-negative (FN) training instances into the generated datasets. Although both types of errors degrade final model performance, previous work on distant supervision denoising has focused more on suppressing FP noise than on resolving the FN problem. We propose H-FND, a hierarchical false-negative denoising framework for robust distant supervision relation extraction, as an FN denoising solution. H-FND uses a hierarchical policy that first determines whether non-relation (NA) instances should be kept, discarded, or revised during training. For instances that are to be revised, the policy then reassigns them appropriate relations, making them better training inputs. Experiments on SemEval-2010 and TACRED were conducted with controlled FN ratios, randomly turning the relations of training and validation instances into negatives to generate FN instances. In this setting, H-FND revises FN instances correctly and maintains high F1 scores even when 50% of the instances have been turned into negatives. A further experiment on NYT10 shows that H-FND is applicable in a realistic setting.
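The controlled-FN setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `inject_false_negatives`, the `"NA"` label string, and the choice to apply the ratio to positive (non-NA) instances only are assumptions made for illustration.

```python
import random

NA = "NA"  # assumed label string for "no relation"


def inject_false_negatives(labels, fn_ratio, seed=0):
    """Relabel a fraction of positive instances as NA to simulate FN noise.

    Assumption: fn_ratio is measured against the positive (non-NA)
    instances, not against the whole dataset.
    Returns the noisy label list and the set of flipped indices.
    """
    rng = random.Random(seed)
    # Indices of instances that currently carry a real relation label.
    positives = [i for i, y in enumerate(labels) if y != NA]
    n_flip = round(fn_ratio * len(positives))
    # Sample without replacement so each instance is flipped at most once.
    flipped = set(rng.sample(positives, n_flip))
    noisy = [NA if i in flipped else y for i, y in enumerate(labels)]
    return noisy, flipped


if __name__ == "__main__":
    gold = ["Cause-Effect", NA, "Component-Whole",
            "Entity-Origin", "Content-Container"]
    noisy, flipped = inject_false_negatives(gold, fn_ratio=0.5)
    print(noisy, sorted(flipped))
```

Returning the flipped indices alongside the noisy labels makes it possible to check afterwards how many of the injected false negatives a denoising method revised back to their original relation.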
