Self-Supervised Training with Autoencoders for Visual Anomaly Detection

Paper Title

Self-Supervised Training with Autoencoders for Visual Anomaly Detection

Authors

Bauer, Alexander, Nakajima, Shinichi, Müller, Klaus-Robert

Abstract

We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold. Here, regularized autoencoders provide a popular approach: they learn the identity mapping on the set of normal examples while trying to prevent good reconstruction of points outside the manifold. Typically, this goal is implemented by controlling the capacity of the model, either directly by reducing the size of the bottleneck layer or implicitly by imposing sparsity (or contraction) constraints on parts of the corresponding network. However, neither of these techniques explicitly penalizes the reconstruction of anomalous signals, which often results in poor detection. We tackle this problem by adapting a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples. Informally, our training objective regularizes the model to produce locally consistent reconstructions while replacing irregularities, so that the model acts as a filter that removes anomalous patterns. To support this intuition, we perform a rigorous formal analysis of the proposed method and provide a number of interesting insights. In particular, we show that the resulting model resembles a non-linear orthogonal projection of partially corrupted images onto the submanifold of uncorrupted samples. Furthermore, we identify the orthogonal projection as an optimal solution for a number of regularized autoencoders, including the contractive and denoising variants. We support our theoretical analysis with an empirical evaluation of the detection and localization performance of the proposed method. In particular, we achieve a new state-of-the-art result on the MVTec AD dataset, a challenging benchmark for visual anomaly detection in the manufacturing domain.
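
The projection view described in the abstract can be illustrated with a toy sketch. The code below is not the authors' method: it assumes, purely for illustration, that normal samples lie on a linear 2-dimensional subspace of a 10-dimensional space, whereas the paper concerns a non-linear projection learned by an autoencoder. Under that assumption the ideal "reconstruction" is the orthogonal projection onto the subspace, and the residual serves as an anomaly score. All names (`basis`, `projector`, `reconstruct`, `anomaly_score`, `anomaly_map`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: normal samples lie on a 2-D linear subspace of R^10.
dim, sub_dim = 10, 2
# Reduced QR yields a (dim, sub_dim) matrix with orthonormal columns.
basis, _ = np.linalg.qr(rng.normal(size=(dim, sub_dim)))
projector = basis @ basis.T  # orthogonal projector onto the subspace


def reconstruct(x):
    """Stand-in for the trained model: project x onto the normal subspace."""
    return projector @ x


def anomaly_map(x):
    """Per-coordinate residual, a toy analogue of a localization map."""
    return np.abs(x - reconstruct(x))


def anomaly_score(x):
    """Distance between x and its projection; near zero for normal samples."""
    return float(np.linalg.norm(x - reconstruct(x)))


# A normal sample lies on the subspace by construction.
normal = basis @ rng.normal(size=sub_dim)

# Partially corrupt one coordinate to simulate a local anomaly.
corrupted = normal.copy()
corrupted[3] += 5.0

print("score (normal):   ", anomaly_score(normal))
print("score (corrupted):", anomaly_score(corrupted))
```

In this linear setting the projection returns the closest point on the subspace, which is not necessarily the original uncorrupted sample. The residual is still zero (up to rounding) for normal inputs and strictly positive for inputs pushed off the subspace, which is the property a reconstruction-based detector relies on.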
