Balancing Bias and Variance for Active Weakly Supervised Learning

Paper Title

Balancing Bias and Variance for Active Weakly Supervised Learning

Authors

Hitesh Sapkota, Qi Yu

Abstract

As a widely used weakly supervised learning scheme, modern multiple instance learning (MIL) models achieve competitive performance at the bag level. However, instance-level prediction, which is essential for many important applications, remains largely unsatisfactory. We propose to conduct novel active deep multiple instance learning that samples a small subset of informative instances for annotation, aiming to significantly boost the instance-level prediction. A variance regularized loss function is designed to properly balance the bias and variance of instance-level predictions, aiming to effectively accommodate the highly imbalanced instance distribution in MIL and other fundamental challenges. Instead of directly minimizing the variance regularized loss that is non-convex, we optimize a distributionally robust bag level likelihood as its convex surrogate. The robust bag likelihood provides a good approximation of the variance based MIL loss with a strong theoretical guarantee. It also automatically balances bias and variance, making it effective to identify the potentially positive instances to support active sampling. The robust bag likelihood can be naturally integrated with a deep architecture to support deep model training using mini-batches of positive-negative bag pairs. Finally, a novel P-F sampling function is developed that combines a probability vector and predicted instance scores, obtained by optimizing the robust bag likelihood. By leveraging the key MIL assumption, the sampling function can explore the most challenging bags and effectively detect their positive instances for annotation, which significantly improves the instance-level prediction. Experiments conducted over multiple real-world datasets clearly demonstrate the state-of-the-art instance-level prediction achieved by the proposed model.
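Two ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so everything below is an assumption: `bag_score` uses the common max-pooling reading of the key MIL assumption (a bag is positive if at least one of its instances is positive), and `variance_regularized_loss` uses a generic form of variance regularization (the empirical mean plus a penalty on the standard error of the losses). The function names and the weight `c` are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pvariance


def bag_score(instance_scores):
    """Bag-level score under the standard MIL assumption (illustrative).

    A bag counts as positive when at least one instance is positive,
    so the bag score is taken as the highest instance score.
    """
    return max(instance_scores)


def variance_regularized_loss(losses, c=1.0):
    """Generic variance-regularized objective (assumed form, not the paper's).

    Returns mean(losses) + c * sqrt(Var(losses) / n), where the first term
    tracks bias and the second term penalizes variance. `c` is a
    hypothetical trade-off weight.
    """
    n = len(losses)
    return mean(losses) + c * math.sqrt(pvariance(losses) / n)


if __name__ == "__main__":
    # One clearly positive instance makes the whole bag score high.
    print(bag_score([0.1, 0.9, 0.3]))
    # Identical losses have zero variance, so the penalty vanishes.
    print(variance_regularized_loss([1.0, 1.0, 1.0]))
    # Spread-out losses are penalized on top of their mean.
    print(variance_regularized_loss([0.0, 2.0]))
```

The penalty term here is non-convex in the model parameters in general, which is consistent with the abstract's motivation for optimizing a convex, distributionally robust surrogate instead; that surrogate is not reproduced in this sketch.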
