Paper Title

Theory-Inspired Path-Regularized Differential Network Architecture Search

Authors

Pan Zhou, Caiming Xiong, Richard Socher, Steven C. H. Hoi

Abstract

Despite its high search efficiency, differentiable architecture search (DARTS) often selects network architectures dominated by skip connections, which leads to performance degradation. However, a theoretical understanding of this issue is still missing, which hinders the principled development of more advanced methods. In this work, we address this problem by theoretically analyzing the effects of various types of operations, e.g. convolution, skip connection, and zero operation, on network optimization. We prove that architectures with more skip connections can converge faster than the other candidates and are thus selected by DARTS. This result, for the first time, theoretically and explicitly reveals the impact of skip connections on fast network optimization and their competitive advantage over other types of operations in DARTS. We then propose a theory-inspired path-regularized DARTS that consists of two key modules: (i) a differentiable group-structured sparse binary gate introduced for each operation to avoid unfair competition among operations, and (ii) a path-depth-wise regularization that encourages the search to explore deep architectures, which, as our theory shows, often converge more slowly than shallow ones and are therefore not well explored during the search. Experimental results on image classification tasks validate its advantages.
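
The competition among operations that the abstract refers to can be illustrated with a minimal sketch of the standard DARTS continuous relaxation, in which each edge outputs a softmax-weighted sum of its candidate operations and the strongest non-zero operation is kept after the search. This is only an illustration under stated assumptions, not the authors' path-regularized method: the sparse binary gates and the path-depth-wise regularizer are not detailed in the abstract and are not reproduced here. The names `mixed_edge`, `select_op`, and `toy_conv_op` are hypothetical, and the "convolution" is a stand-in that merely scales its input.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations acting on a list of floats.
def zero_op(x):
    # Zero operation: drops the edge entirely.
    return [0.0 for _ in x]


def skip_op(x):
    # Skip connection: identity mapping.
    return list(x)


def toy_conv_op(x, weight=0.5):
    # Stand-in for a convolution (assumption): a single scalar weight.
    return [weight * v for v in x]


CANDIDATES = [("zero", zero_op), ("skip", skip_op), ("conv", toy_conv_op)]


def mixed_edge(x, alphas):
    """Softmax-weighted sum of all candidate operations on one edge."""
    weights = softmax(alphas)
    out = [0.0] * len(x)
    for w, (_, op) in zip(weights, CANDIDATES):
        y = op(x)
        out = [o + w * v for o, v in zip(out, y)]
    return out


def select_op(alphas):
    """Keep the non-zero operation with the largest architecture parameter."""
    non_zero = [i for i, (name, _) in enumerate(CANDIDATES) if name != "zero"]
    best = max(non_zero, key=lambda i: alphas[i])
    return CANDIDATES[best][0]
```

Because all operations on an edge share one softmax, an operation whose architecture parameter grows faster during the search crowds out the others; in this sketch, `select_op([0.0, 2.0, 1.0])` picks the skip connection over the toy convolution.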
