Paper Title

Boosting Out-of-Distribution Detection with Multiple Pre-trained Models

Paper Authors

Feng Xue, Zi He, Chuanlong Xie, Falong Tan, Zhenguo Li

Paper Abstract

Out-of-Distribution (OOD) detection, i.e., identifying whether an input is sampled from a novel distribution other than the training distribution, is a critical task for safely deploying machine learning systems in the open world. Recently, post hoc detection utilizing pre-trained models has shown promising performance and can be scaled to large-scale problems. This advance raises a natural question: Can we leverage the diversity of multiple pre-trained models to improve the performance of post hoc detection methods? In this work, we propose a detection enhancement method by ensembling multiple detection decisions derived from a zoo of pre-trained models. Our approach uses the p-value instead of the commonly used hard threshold and leverages a fundamental framework of multiple hypothesis testing to control the true positive rate of In-Distribution (ID) data. We focus on the usage of model zoos and provide systematic empirical comparisons with current state-of-the-art methods on various OOD detection benchmarks. The proposed ensemble scheme shows consistent improvement compared to single-model detectors and significantly outperforms the current competitive methods. Our method substantially improves the relative performance by 65.40% and 26.96% on the CIFAR10 and ImageNet benchmarks.
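
The abstract describes converting each pre-trained model's detection decision into a p-value and combining the results under a multiple-hypothesis-testing framework. The sketch below illustrates one way such a scheme could look: empirical p-values are computed against an ID calibration set for each model, and Fisher's method is used as a stand-in combination rule. The function names, the calibration-based p-value construction, and the choice of Fisher's method are illustrative assumptions; the paper's exact procedure is not specified in the abstract.

```python
import numpy as np
from scipy.stats import chi2


def empirical_p_value(score, id_calibration_scores):
    """P-value for the null hypothesis 'the input is ID', from an ID calibration set.

    Assumes lower scores look more OOD-like, so the p-value is the fraction of
    ID calibration scores that are as low as or lower than the test score.
    """
    calib = np.asarray(id_calibration_scores)
    return (np.sum(calib <= score) + 1) / (len(calib) + 1)


def ensemble_ood_decision(test_scores, calib_scores_per_model, alpha=0.05):
    """Flag an input as OOD by combining per-model p-values (illustrative).

    test_scores: one scalar OOD score per pre-trained model for the test input.
    calib_scores_per_model: one array of ID calibration scores per model.
    alpha: significance level; falsely rejecting ID inputs at rate alpha keeps
        the ID true positive rate near 1 - alpha.
    """
    p_values = np.array([
        empirical_p_value(s, c)
        for s, c in zip(test_scores, calib_scores_per_model)
    ])
    # Fisher's method: under the ID null (and independent p-values), the
    # statistic -2 * sum(log p_i) follows a chi-squared distribution with
    # 2k degrees of freedom, where k is the number of models.
    stat = -2.0 * np.sum(np.log(p_values))
    combined_p = chi2.sf(stat, df=2 * len(p_values))
    return combined_p < alpha  # True -> declare the input OOD
```

In practice, the per-model scores would come from any post hoc detector (e.g., maximum softmax probability or energy scores) evaluated on each model in the zoo; only the combination step above is specific to this sketch.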
