Paper Title
Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup
Paper Authors
Paper Abstract
Mixup is a data augmentation technique that relies on training using random convex combinations of data points and their labels. In recent years, Mixup has become a standard primitive used in the training of state-of-the-art image classification models due to its demonstrated benefits over empirical risk minimization with regard to generalization and robustness. In this work, we try to explain some of this success from a feature learning perspective. We focus our attention on classification problems in which each class may have multiple associated features (or views) that can be used to predict the class correctly. Our main theoretical results demonstrate that, for a non-trivial class of data distributions with two features per class, training a 2-layer convolutional network using empirical risk minimization can lead to learning only one feature for almost all classes while training with a specific instantiation of Mixup succeeds in learning both features for every class. We also show empirically that these theoretical insights extend to the practical settings of image benchmarks modified to have multiple features.
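To make the training rule described above concrete, here is a minimal NumPy sketch of the convex-combination step. The function name `mixup_batch`, the permutation-based pairing, and the `midpoint` flag are illustrative choices and not the authors' code; standard Mixup draws the mixing weight from a Beta(alpha, alpha) distribution, while the midpoint variant named in the title is assumed here to fix the weight at 1/2.

```python
import numpy as np

def mixup_batch(x, y, alpha=1.0, midpoint=False, rng=None):
    """Mix a batch of inputs x and one-hot labels y with a random permutation of the batch.

    With midpoint=True the mixing weight is fixed at 1/2 (a sketch of the
    Midpoint Mixup instantiation); otherwise it is drawn from Beta(alpha, alpha),
    as in standard Mixup.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = 0.5 if midpoint else rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y + (1.0 - lam) * y[perm]
    return x_mixed, y_mixed


# Example: mix a batch of 4 flattened "images" with 3-class one-hot labels.
x = np.random.rand(4, 8)
y = np.eye(3)[np.array([0, 1, 2, 0])]
x_mid, y_mid = mixup_batch(x, y, midpoint=True)   # Midpoint Mixup: weight = 1/2
x_std, y_std = mixup_batch(x, y, alpha=1.0)       # Standard Mixup: random weight
```

In a typical training loop, the mixed pair (x_mixed, y_mixed) would simply replace the original batch in the usual cross-entropy objective.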