Paper Title

Empirical Bayes Transductive Meta-Learning with Synthetic Gradients

Authors

Shell Xu Hu, Pablo G. Moreno, Yang Xiao, Xi Shen, Guillaume Obozinski, Neil D. Lawrence, Andreas Damianou

Abstract

We propose a meta-learning approach that learns from multiple tasks in a transductive setting, by leveraging the unlabeled query set in addition to the support set to generate a more powerful model for each task. To develop our framework, we revisit the empirical Bayes formulation for multi-task learning. The evidence lower bound of the marginal log-likelihood of empirical Bayes decomposes as a sum of local KL divergences between the variational posterior and the true posterior on the query set of each task. We derive a novel amortized variational inference that couples all the variational posteriors via a meta-model, which consists of a synthetic gradient network and an initialization network. Each variational posterior is derived from synthetic gradient descent to approximate the true posterior on the query set, even where we do not have access to the true gradient. Our results on the Mini-ImageNet and CIFAR-FS benchmarks for episodic few-shot classification outperform previous state-of-the-art methods. In addition, we conduct two zero-shot learning experiments to further explore the potential of the synthetic gradient.
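The core mechanism in the abstract can be sketched as follows: an initialization network supplies a starting point for the task-specific parameters, and a synthetic gradient network predicts a descent direction from unlabeled query features, so the inner adaptation loop never touches true (label-dependent) gradients. The sketch below is a minimal, purely illustrative NumPy version; the dimensions, the linear stand-ins for both networks, and the function names are all assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration only.
FEAT_DIM, N_CLASSES, N_QUERY, N_STEPS = 8, 3, 5, 3

# Stand-in for the initialization network's output: a task-agnostic
# starting point for the task-specific classifier weights.
w_init = rng.normal(scale=0.1, size=(FEAT_DIM, N_CLASSES))

# Stand-in for the synthetic gradient network: here just a random
# linear map applied to the logits. In the paper this is a trained
# meta-model shared across tasks.
sg_params = rng.normal(scale=0.1, size=(N_CLASSES, N_CLASSES))

def synthetic_gradient(w, x_query):
    """Predict a gradient for w from unlabeled query features only.

    No query labels are used, so this mimics the transductive setting:
    the direction comes from a learned network, not a true loss gradient.
    """
    logits = x_query @ w                  # (N_QUERY, N_CLASSES)
    pseudo_error = logits @ sg_params     # predicted error signal
    return x_query.T @ pseudo_error / len(x_query)

# Inner loop: adapt the weights by descending the *synthetic* gradient.
x_query = rng.normal(size=(N_QUERY, FEAT_DIM))
w = w_init.copy()
for _ in range(N_STEPS):
    w = w - 0.1 * synthetic_gradient(w, x_query)
```

After the loop, `w` is the adapted, task-specific parameter estimate; in the paper this adapted point parameterizes the variational posterior that approximates the true posterior on the query set.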
