Paper Title
Revisiting Loss Modelling for Unstructured Pruning
Authors
Abstract
By removing parameters from deep neural networks, unstructured pruning methods aim to reduce memory footprint and computational cost while maintaining prediction accuracy. To tackle this otherwise intractable problem, many of these methods model the loss landscape using first- or second-order Taylor expansions to identify which parameters can be discarded. We revisit loss modelling for unstructured pruning: we show the importance of ensuring the locality of the pruning steps. We systematically compare first- and second-order Taylor expansions and empirically show that both can reach similar levels of performance. Finally, we show that better preserving the original network function does not necessarily transfer to better-performing networks after fine-tuning, suggesting that only considering the impact of pruning on the loss might not be a sufficient objective for designing good pruning criteria.
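To make the loss-modelling idea concrete: pruning perturbs the weights by δθ (setting the removed entries to zero), and the resulting change in loss is commonly approximated by a Taylor expansion around the current weights θ. A generic form of this expansion, consistent with the first- and second-order criteria the abstract refers to, is:

$$
\delta \mathcal{L}(\theta) \;\approx\; \nabla_\theta \mathcal{L}(\theta)^{\top}\,\delta\theta \;+\; \tfrac{1}{2}\,\delta\theta^{\top} H\,\delta\theta
$$

where $H$ is the Hessian of the loss (in practice usually approximated, e.g. by a diagonal or Fisher approximation). First-order criteria score a weight $w_i$ by $|g_i w_i|$, the estimated loss change from zeroing it; second-order criteria add the curvature term.

As a minimal illustration of such a criterion, the sketch below scores each weight with the first-order estimate $|g_i w_i|$ and zeroes the lowest-scoring fraction. This is a generic sketch of the technique, not the paper's implementation; the function names and the single global threshold across layers are assumptions made here for brevity.

```python
import torch

def first_order_saliency(model, loss):
    """Score each weight by the first-order Taylor estimate |g_i * w_i|
    of the loss change incurred by zeroing it (a generic sketch, not
    the paper's exact criterion)."""
    params = list(model.parameters())
    grads = torch.autograd.grad(loss, params)
    return [(g * w).abs() for g, w in zip(grads, params)]

@torch.no_grad()
def prune_by_saliency(model, saliencies, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with the smallest
    saliency, using one global threshold across all layers (an assumption)."""
    flat = torch.cat([s.flatten() for s in saliencies])
    threshold = torch.quantile(flat, sparsity)
    for w, s in zip(model.parameters(), saliencies):
        w.mul_((s > threshold).to(w.dtype))
```

A second-order variant would add an estimate of the curvature term to each score; since the abstract reports that both orders reach similar performance after fine-tuning, the cheaper first-order form is often the practical starting point.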