具有任意规范的多种学习

论文标题

具有任意规范的多种学习

Manifold learning with arbitrary norms

论文作者

Kileel, Joe, Moscovich, Amit, Zelesko, Nathan, Singer, Amit

论文摘要

多种学习方法在降低非线性维度和其他任务中起着重要的作用，涉及具有较低内在维度的高维数据集。这些方法中的许多是基于图的：它们将一个顶点与每个数据点相关联，并将加权边缘与每对加权边缘关联。现有理论表明，图表的拉普拉斯矩阵将数据歧管的拉普拉斯 - 贝特拉米操作员收敛，假设成对的亲和力是基于欧几里得的规范。在本文中，我们确定了使用$ \ textit {any} $ norm构建的图形laplacians的限制差分运算符。我们的证明涉及歧管的第二个基本形式与给定标准的单位球的凸几何形状之间的相互作用。为了证明非欧国人规范在多种学习中的潜在益处，我们考虑了映射具有连续变化的大分子运动的任务。在数值模拟中，我们表明，基于EarthMover的距离，经过修改的Laplacian eigenmaps算法优于经典的欧几里得laplacian eigenmaps，无论是在计算成本和恢复内在几何形状所需的样本量而言。

Manifold learning methods play a prominent role in nonlinear dimensionality reduction and other tasks involving high-dimensional data sets with low intrinsic dimensionality. Many of these methods are graph-based: they associate a vertex with each data point and a weighted edge with each pair. Existing theory shows that the Laplacian matrix of the graph converges to the Laplace-Beltrami operator of the data manifold, under the assumption that the pairwise affinities are based on the Euclidean norm. In this paper, we determine the limiting differential operator for graph Laplacians constructed using $\textit{any}$ norm. Our proof involves an interplay between the second fundamental form of the manifold and the convex geometry of the given norm's unit ball. To demonstrate the potential benefits of non-Euclidean norms in manifold learning, we consider the task of mapping the motion of large molecules with continuous variability. In a numerical simulation we show that a modified Laplacian eigenmaps algorithm, based on the Earthmover's distance, outperforms the classic Euclidean Laplacian eigenmaps, both in terms of computational cost and the sample size needed to recover the intrinsic geometry.

下载PDF全文

下载文献需遵守相关版权规定

论文标题