Paper Title
Cross-Domain Video Anomaly Detection without Target Domain Adaptation
Paper Authors
Paper Abstract
Most cross-domain unsupervised Video Anomaly Detection (VAD) works assume that at least a few task-relevant target-domain training samples are available for adaptation from the source to the target domain. However, this requires laborious model-tuning by the end-user, who may prefer to have a system that works ``out-of-the-box.'' To address such practical scenarios, we identify a novel target-domain (inference-time) VAD task where no target-domain training data are available. To this end, we propose a new `Zero-shot Cross-domain Video Anomaly Detection (zxvad)' framework that includes a future-frame prediction generative model setup. Different from prior future-frame prediction models, our model uses a novel Normalcy Classifier module to learn the features of normal-event videos by learning how such features differ ``relatively'' from the features of pseudo-abnormal examples. A novel Anomaly Synthesis module, based on an untrained Convolutional Neural Network, crafts these pseudo-abnormal examples by adding foreign objects to normal video frames with no extra training cost. With our novel relative normalcy feature learning strategy, zxvad generalizes and learns to distinguish between normal and abnormal frames in a new target domain without adaptation during inference. Through evaluations on common datasets, we show that zxvad outperforms the state-of-the-art (SOTA), regardless of whether task-relevant (i.e., VAD) source training data are available or not. Lastly, zxvad also beats the SOTA methods in inference-time efficiency metrics, including model size, total parameters, GPU energy consumption, and GMACs.
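The abstract's Anomaly Synthesis idea, passing a foreign object patch through a fixed, randomly initialized (untrained) convolutional layer and pasting the result into a normal frame, can be illustrated with a minimal sketch. This is not the paper's actual architecture; the layer shape, normalization, and pasting logic here are illustrative assumptions, written in plain NumPy to show why the step adds no training cost (the weights are sampled once and never updated).

```python
import numpy as np

def untrained_conv(patch, rng, n_filters=3, k=3):
    """One randomly initialized conv layer that is never trained --
    an illustrative stand-in for the paper's untrained CNN.
    patch: (H, W, C) float array; returns (H, W, n_filters)."""
    h, w, c = patch.shape
    # Weights are sampled once and frozen: zero training cost.
    weights = rng.normal(0.0, 0.1, size=(n_filters, k, k, c))
    pad = k // 2
    padded = np.pad(patch, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros((h, w, n_filters))
    for f in range(n_filters):
        for i in range(h):
            for j in range(w):
                out[i, j, f] = np.sum(padded[i:i + k, j:j + k, :] * weights[f])
    return out

def synthesize_pseudo_abnormal(frame, foreign_patch, top, left, rng):
    """Distort a foreign-object patch with the untrained conv, then
    paste it into a normal frame to craft a pseudo-abnormal example."""
    distorted = untrained_conv(foreign_patch, rng)
    # Rescale to [0, 1] so the pasted region stays a valid image.
    distorted = (distorted - distorted.min()) / (np.ptp(distorted) + 1e-8)
    out = frame.copy()
    h, w, _ = distorted.shape
    out[top:top + h, left:left + w, :] = distorted
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal_frame = np.zeros((32, 32, 3))          # toy "normal" frame
    foreign_patch = rng.uniform(0, 1, (8, 8, 3))  # toy foreign object
    pseudo = synthesize_pseudo_abnormal(normal_frame, foreign_patch, 4, 4, rng)
    print(pseudo.shape)
```

Because the convolution weights are random and fixed, each call distorts the foreign object differently from its source appearance, giving the Normalcy Classifier contrasting "abnormal" features without any additional optimization.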