Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics

Paper Title

Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics

Authors

Alexander Naumann, Felix Hertlein, Benchun Zhou, Laura Dörr, Kai Furmans

Abstract

State-of-the-art approaches in computer vision rely heavily on sufficiently large training datasets. For real-world applications, obtaining such a dataset is usually a tedious task. In this paper, we present a fully automated pipeline that generates a synthetic dataset for instance segmentation in four steps. In contrast to existing work, our pipeline covers every step from data acquisition to the final dataset. We first scrape images of the objects of interest from popular image search engines. Because we rely only on text-based queries, the resulting data comprises a wide variety of images, so image selection is necessary as a second step. This approach of image scraping and selection relaxes the need for a real-world domain-specific dataset that must be either publicly available or created for this purpose. We employ an object-agnostic background removal model and compare three different methods for image selection: object-agnostic pre-processing, manual image selection and CNN-based image selection. In the third step, we generate random arrangements of the object of interest and distractors on arbitrary backgrounds. Finally, the images are composed by pasting the objects using four different blending methods. We present a case study for our dataset generation approach by considering parcel segmentation. For the evaluation, we created a dataset of parcel photos that were annotated automatically. We find that (1) our dataset generation pipeline allows a successful transfer to real test images (Mask AP 86.2), (2) a very accurate image selection process, contrary to human intuition, is not crucial, and a broader category definition can help to bridge the domain gap, and (3) the use of blending methods is beneficial compared to simple copy-and-paste. We make our full code for scraping, image composition and training publicly available at https://a-nau.github.io/parcel2d.
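The third and fourth steps of the abstract (random arrangement, then pasting with blending) can be illustrated with a minimal sketch. This is not the authors' code: the function names `box_blur`, `paste` and `random_composition` are hypothetical, images are plain nested lists of grayscale floats so the example needs only the standard library, and a box-blurred mask stands in as one assumed example of a blending method, since the abstract does not name the four methods it compares.

```python
import random


def box_blur(mask, radius=1):
    """Soften a mask by averaging each pixel with its neighbours (assumed blending)."""
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [mask[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def paste(background, obj, mask, top, left, blur_radius=0):
    """Paste obj onto a copy of background; return (image, instance mask).

    blur_radius=0 is plain copy-and-paste; a positive radius blends the edges.
    """
    if blur_radius > 0:
        alpha = box_blur(mask, blur_radius)
    else:
        alpha = [[float(v) for v in row] for row in mask]
    image = [row[:] for row in background]  # leave the input untouched
    label = [[0] * len(background[0]) for _ in background]
    for y, row in enumerate(obj):
        for x, value in enumerate(row):
            a = alpha[y][x]
            by, bx = top + y, left + x
            image[by][bx] = a * value + (1 - a) * image[by][bx]
            if mask[y][x]:
                label[by][bx] = 1  # annotation comes for free from the mask
    return image, label


def random_composition(background, obj, mask, rng, blur_radius=0):
    """Place the object at a random position that fits inside the background."""
    top = rng.randint(0, len(background) - len(obj))
    left = rng.randint(0, len(background[0]) - len(obj[0]))
    return paste(background, obj, mask, top, left, blur_radius)
```

Because the instance mask is derived directly from the cut-out mask, every composed image is annotated automatically. A real pipeline would work on colour image arrays and also paste distractor objects, which this sketch omits.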
