Paper Title


The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures

Paper Authors

Yawei Li, Wen Li, Martin Danelljan, Kai Zhang, Shuhang Gu, Luc Van Gool, Radu Timofte

Paper Abstract


In this paper, we tackle the problem of convolutional neural network design. Instead of focusing on the design of the overall architecture, we investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks. We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance. Based on that, we articulate the heterogeneity hypothesis: with the same training protocol, there exists a layer-wise differentiated network architecture (LW-DNA) that can outperform the original network with regular channel configurations but with a lower level of model complexity. The LW-DNA models are identified without extra computational cost or training time compared with the original network. This constraint leads to controlled experiments which direct the focus to the importance of layer-wise specific channel configurations. LW-DNA models come with advantages related to overfitting, i.e. the relative relationship between model complexity and dataset size. Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration. The resultant LW-DNA models consistently outperform the baseline models. Code is available at https://github.com/ofsoundof/Heterogeneity_Hypothesis.
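The core idea of the abstract, i.e. replacing a network's uniform per-layer channel counts with a layer-wise differentiated configuration at a lower overall complexity, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation (see the repository linked above); the helper `build_cnn` and the specific channel counts below are hypothetical choices used only to contrast the two configurations.

```python
import torch
import torch.nn as nn


def build_cnn(channels, num_classes=10, in_ch=3):
    """Stack 3x3 conv/BN/ReLU blocks whose widths follow `channels`."""
    layers = []
    for out_ch in channels:
        layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                   nn.BatchNorm2d(out_ch),
                   nn.ReLU(inplace=True)]
        in_ch = out_ch
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),
               nn.Linear(in_ch, num_classes)]
    return nn.Sequential(*layers)


# Regular configuration: every layer uses the same width.
baseline = build_cnn([64, 64, 64, 64])

# Layer-wise differentiated configuration (LW-DNA-style): widths vary per
# layer and the total parameter count is slightly lower than the baseline.
# These widths are illustrative assumptions, not values reported in the paper.
lw_dna = build_cnn([48, 72, 64, 56])

x = torch.randn(2, 3, 32, 32)
print(baseline(x).shape, lw_dna(x).shape)           # same output shapes
print(sum(p.numel() for p in baseline.parameters()),
      sum(p.numel() for p in lw_dna.parameters()))  # lw_dna uses fewer params
```

The heterogeneity hypothesis states that, under the same training protocol, a configuration like `lw_dna` can be found (in the paper, by shrinking a widened baseline network) that matches or outperforms `baseline` despite its lower model complexity.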
