Paper Title

On the Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning

Authors

Lorenzo Bonicelli, Matteo Boschini, Angelo Porrello, Concetto Spampinato, Simone Calderara

Abstract

Rehearsal approaches enjoy immense popularity with Continual Learning (CL) practitioners. These methods collect samples from previously encountered data distributions in a small memory buffer; subsequently, they repeatedly optimize on the latter to prevent catastrophic forgetting. This work draws attention to a hidden pitfall of this widespread practice: repeated optimization on a small pool of data inevitably leads to tight and unstable decision boundaries, which are a major hindrance to generalization. To address this issue, we propose Lipschitz-DrivEn Rehearsal (LiDER), a surrogate objective that induces smoothness in the backbone network by constraining its layer-wise Lipschitz constants w.r.t. replay examples. By means of extensive experiments, we show that applying LiDER delivers a stable performance gain to several state-of-the-art rehearsal CL methods across multiple datasets, both in the presence and absence of pre-training. Through additional ablative experiments, we highlight peculiar aspects of buffer overfitting in CL and better characterize the effect produced by LiDER. Code is available at https://github.com/aimagelab/LiDER
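To make the idea concrete, the snippet below is a minimal, hypothetical PyTorch sketch of a layer-wise smoothness penalty on replay examples; it is not the authors' implementation (see the linked repository for the exact LiDER loss). It estimates a per-layer expansion ratio from intermediate features of buffer samples and penalizes expansion beyond a target budget; the function name lipschitz_penalty and the parameters budget and alpha are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def lipschitz_penalty(feature_maps, budget=1.0):
    """Hedged sketch of a layer-wise Lipschitz-style penalty.

    feature_maps: list of [B, D_l] tensors, flattened activations after
    each block from a forward pass on replay (buffer) examples.
    The proxy used here (ratio of pairwise output distances to pairwise
    input distances) is an illustrative empirical estimate, not the
    paper's exact estimator.
    """
    penalty = feature_maps[0].new_zeros(())
    for f_in, f_out in zip(feature_maps[:-1], feature_maps[1:]):
        d_in = torch.cdist(f_in, f_in) + 1e-8   # pairwise distances before the block
        d_out = torch.cdist(f_out, f_out)       # pairwise distances after the block
        ratios = d_out / d_in                   # empirical expansion per sample pair
        # penalize expansion beyond the target Lipschitz budget
        penalty = penalty + F.relu(ratios.max() - budget)
    return penalty

# Hypothetical usage inside a rehearsal training step:
# loss = task_loss + alpha * lipschitz_penalty(buffer_feature_maps)
```

In this reading, the penalty acts as a surrogate regularizer added to the base rehearsal objective, discouraging the backbone from carving tight, unstable decision boundaries around the few examples stored in the buffer.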
