Paper Title

Improving Calibration in Deep Metric Learning With Cross-Example Softmax

Authors

Andreas Veit, Kimberly Wilber

Abstract

Modern image retrieval systems increasingly rely on the use of deep neural networks to learn embedding spaces in which distance encodes the relevance between a given query and image. In this setting, existing approaches tend to emphasize one of two properties. Triplet-based methods capture top-$k$ relevancy, where all top-$k$ scoring documents are assumed to be relevant to a given query. Pairwise contrastive models capture threshold relevancy, where all documents scoring higher than some threshold are assumed to be relevant. In this paper, we propose Cross-Example Softmax, which combines the properties of top-$k$ and threshold relevancy. In each iteration, the proposed loss encourages all queries to be closer to their matching images than all queries are to all non-matching images. This leads to a globally more calibrated similarity metric and makes distance more interpretable as an absolute measure of relevance. We further introduce Cross-Example Negative Mining, in which each pair is compared to the hardest negative comparisons across the entire batch. Empirically, we show in a series of experiments on Conceptual Captions and Flickr30k that the proposed method effectively improves global calibration as well as retrieval performance.
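The key idea in the abstract — each matching pair competes against all non-matching pairs across the entire batch, not only the negatives in its own row — can be illustrated with a minimal NumPy sketch. This is an illustrative reading of the described loss, not the paper's reference implementation; the function name and the batch-diagonal pairing convention are assumptions for illustration.

```python
import numpy as np

def cross_example_softmax_loss(sim):
    """Illustrative sketch of a cross-example softmax loss.

    sim: (B, B) query-image similarity matrix where sim[i, i] is the
    matching (positive) pair for query i and every off-diagonal entry
    is a non-matching (negative) pair. Unlike a per-row softmax, each
    positive competes against the shared pool of ALL negatives in the
    batch, which pushes the model toward a globally calibrated scale.
    """
    B = sim.shape[0]
    pos = np.diag(sim)                    # (B,) positive similarities
    neg = sim[~np.eye(B, dtype=bool)]     # (B*(B-1),) all cross-example negatives
    # Log-sum-exp over the shared negative pool (numerically stable).
    neg_lse = np.logaddexp.reduce(neg)
    # Per query: -log( e^{pos_i} / (e^{pos_i} + sum over ALL negatives) )
    loss = -(pos - np.logaddexp(pos, neg_lse))
    return loss.mean()
```

With a well-separated similarity matrix (large diagonal, small off-diagonal) the loss approaches zero, while an uncalibrated matrix where positives and negatives share the same score incurs a large penalty, reflecting the threshold-relevancy property the abstract describes.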
