Paper Title

Gradient Descent in RKHS with Importance Labeling

Paper Authors

Tomoya Murata, Taiji Suzuki

Paper Abstract

Labeling cost is often expensive and is a fundamental limitation of supervised learning. In this paper, we study the importance labeling problem, in which we are given many unlabeled data points, select a limited number of them to be labeled, and then run a learning algorithm on the selected subset. We propose a new importance labeling scheme that can effectively select an informative subset of unlabeled data for least squares regression in Reproducing Kernel Hilbert Spaces (RKHS). We analyze the generalization error of gradient descent combined with our labeling scheme and show that the proposed algorithm achieves the optimal rate of convergence in a much wider range of settings, and in particular attains much better generalization than the usual uniform sampling scheme in small label noise settings. Numerical experiments verify our theoretical findings.
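The abstract describes the pipeline only at a high level: select a small, informative subset of an unlabeled pool, query labels for that subset, then run gradient descent on a kernel least squares objective. The sketch below illustrates that pipeline under loud assumptions: ridge leverage-score sampling is used as a stand-in importance scheme (the abstract does not specify the paper's actual labeling scheme), and `rbf_kernel`, `leverage_score_sampling`, and `kernel_gd` are hypothetical helper names, not the authors' code.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X and Y.
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def leverage_score_sampling(X, n_label, reg=1e-3, gamma=1.0, rng=None):
    # Stand-in importance scheme (an assumption, not the paper's method):
    # sample points to label with probability proportional to their
    # ridge leverage scores diag(K (K + reg*n*I)^{-1}).
    rng = rng if rng is not None else np.random.default_rng(0)
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    lev = np.diag(K @ np.linalg.inv(K + reg * n * np.eye(n)))
    p = lev / lev.sum()
    idx = rng.choice(n, size=n_label, replace=False, p=p)
    return idx, p[idx]

def kernel_gd(X_lab, y, weights, n_iter=200, lr=0.5, gamma=1.0):
    # Functional gradient descent for weighted least squares in the RKHS:
    # f(x) = sum_i alpha_i k(x_i, x), minimizing (1/2n) sum_i w_i (f(x_i) - y_i)^2.
    K = rbf_kernel(X_lab, X_lab, gamma)
    alpha = np.zeros(len(y))
    for _ in range(n_iter):
        # Coefficients of the functional gradient in the k(x_i, .) basis.
        alpha -= lr * weights * (K @ alpha - y) / len(y)
    return alpha

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 2))            # pool of unlabeled points
    idx, p_sel = leverage_score_sampling(X, n_label=50, rng=rng)
    y = np.sin(X[idx].sum(axis=1))           # labels queried only for the selection
    w = 1.0 / (len(X) * p_sel)               # approximate de-biasing importance weights
    alpha = kernel_gd(X[idx], y, w)
    resid = rbf_kernel(X[idx], X[idx]) @ alpha - y
    print("train RMSE:", np.sqrt(np.mean(resid ** 2)))
```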
