Paper title
AutoOD: Automated Outlier Detection via Curiosity-guided Search and Self-imitation Learning
Authors
Abstract
Outlier detection is an important data mining task with numerous practical applications, such as intrusion detection, credit card fraud detection, and video surveillance. However, for a specific, complicated task with big data, building a powerful deep-learning-based outlier detection system still relies heavily on human expertise and laborious trial and error. Although Neural Architecture Search (NAS) has shown promise in discovering effective deep architectures in various domains, such as image classification, object detection, and semantic segmentation, contemporary NAS methods are not suitable for outlier detection because they lack an intrinsic search space, have an unstable search process, and suffer from low sample efficiency. To bridge this gap, we propose AutoOD, an automated outlier detection framework that searches for an optimal neural network model within a predefined search space. Specifically, we first design a curiosity-guided search strategy to overcome the curse of local optimality: a controller, which acts as a search agent, is encouraged to take actions that maximize the information gain about its internal belief. We further introduce an experience replay mechanism based on self-imitation learning to improve sample efficiency. Experimental results on various real-world benchmark datasets demonstrate that the deep model identified by AutoOD achieves the best performance compared with existing handcrafted models and traditional search methods.
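The two ideas named in the abstract, a curiosity bonus based on information gain and an experience replay buffer for self-imitation, can be illustrated with a small toy sketch. This is not the AutoOD implementation. The search space, the count-based belief (a stand-in for the controller's internal belief), the uniform random sampler (a stand-in for a learned controller), the placeholder reward, and all names such as `CountBelief`, `SelfImitationBuffer`, and `beta` are assumptions made purely for illustration.

```python
import heapq
import math
import random

# Hypothetical search space; the abstract does not list the actual choices.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3],
    "hidden_units": [16, 32, 64],
    "activation": ["relu", "tanh", "sigmoid"],
}


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


class CountBelief:
    """Toy belief (assumption): pseudo-counts over the options of each decision."""

    def __init__(self, space):
        self.counts = {k: [1.0] * len(v) for k, v in space.items()}

    def probs(self, key):
        total = sum(self.counts[key])
        return [c / total for c in self.counts[key]]

    def information_gain(self, key, index):
        """Update the belief with one observed choice; return KL(after || before)."""
        before = self.probs(key)
        self.counts[key][index] += 1.0
        after = self.probs(key)
        return kl_divergence(after, before)


class SelfImitationBuffer:
    """Keeps the top-k (reward, architecture) pairs seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []  # min-heap: the worst retained sample sits at index 0
        self._tick = 0   # unique tie-breaker so architectures are never compared

    def add(self, reward, arch):
        item = (reward, self._tick, arch)
        self._tick += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        elif reward > self._heap[0][0]:
            heapq.heapreplace(self._heap, item)

    def replay(self, baseline):
        """Return stored samples whose reward beats the given baseline."""
        return [(r, a) for r, _, a in self._heap if r > baseline]


def sample_architecture(space, rng):
    """Uniform random sampling of option indices (stand-in for a learned controller)."""
    return {k: rng.randrange(len(v)) for k, v in space.items()}


def toy_reward(arch):
    """Placeholder for the validation performance of a trained detector."""
    return sum(arch.values()) / (2.0 * len(arch))


def search(steps=50, beta=0.1, seed=0):
    rng = random.Random(seed)
    belief = CountBelief(SEARCH_SPACE)
    buffer = SelfImitationBuffer(capacity=5)
    for _ in range(steps):
        arch = sample_architecture(SEARCH_SPACE, rng)
        bonus = sum(belief.information_gain(k, i) for k, i in arch.items())
        shaped = toy_reward(arch) + beta * bonus  # extrinsic reward + curiosity term
        buffer.add(shaped, arch)
    return buffer
```

In this sketch, rarely chosen options shift the belief more and therefore earn a larger bonus, which nudges the search away from options it has already explored heavily. The buffer retains only the best-scoring architectures, so a policy update could reuse them through `replay(baseline)` instead of discarding past samples.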