Paper title
Shining light on data: Geometric data analysis through quantum dynamics
Paper authors
Paper abstract
Experimental sciences have come to depend heavily on our ability to organize and interpret high-dimensional datasets. Natural laws, conservation principles, and inter-dependencies among observed variables yield geometric structure, with fewer degrees of freedom, on the dataset. We introduce the frameworks of semiclassical and microlocal analysis to data analysis and develop a novel yet natural uncertainty principle for extracting fine-scale features of this geometric structure in data, crucially dependent on data-driven approximations to the quantum mechanical processes underlying geometric optics. This leads to the first tractable algorithm for approximating wave dynamics and geodesics on data manifolds with rigorous probabilistic convergence rates under the manifold hypothesis. We demonstrate our algorithm on real-world datasets, including an analysis of population mobility information during the COVID-19 pandemic, where it achieves a four-fold improvement in dimensionality reduction over the existing state of the art and reveals anomalous behavior exhibited by less than 1.2% of the entire dataset. Our work initiates the study of data-driven quantum dynamics for analyzing datasets, and we outline several future directions for research.
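To make the phrase "data-driven approximations to quantum mechanical processes" more concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a Gaussian-kernel graph Laplacian as a stand-in for the manifold Laplacian, a localized oscillatory initial state, and propagation by the unitary operator exp(-i t sqrt(L)). The function names (`graph_laplacian`, `coherent_state`, `propagate`) and all parameter values (`eps`, `h`, `t`) are hypothetical choices for illustration; the abstract does not specify any of them.

```python
import numpy as np

def graph_laplacian(points, eps):
    """Symmetric normalized graph Laplacian from a Gaussian kernel (assumed construction)."""
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-sq_dists / eps)
    d_inv_sqrt = 1.0 / np.sqrt(kernel.sum(axis=1))
    normalized = kernel * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return (np.eye(len(points)) - normalized) / eps

def coherent_state(points, x0, xi0, h):
    """Unit-norm state localized near x0 and oscillating in direction xi0 (illustrative)."""
    diff = points - x0
    envelope = np.exp(-np.sum(diff ** 2, axis=1) / (2.0 * h))
    phase = np.exp(1j * (diff @ xi0) / h)
    psi = envelope * phase
    return psi / np.linalg.norm(psi)

def propagate(laplacian, psi0, t):
    """Apply exp(-i t sqrt(L)) to psi0 using the eigendecomposition of L."""
    evals, evecs = np.linalg.eigh(laplacian)
    evals = np.clip(evals, 0.0, None)  # discard tiny negative round-off
    phases = np.exp(-1j * t * np.sqrt(evals))
    return evecs @ (phases * (evecs.T @ psi0))

# Toy data: random samples from the unit circle, a 1-dimensional manifold in R^2.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=300)
points = np.column_stack([np.cos(angles), np.sin(angles)])

lap = graph_laplacian(points, eps=0.05)
psi0 = coherent_state(points, x0=np.array([1.0, 0.0]), xi0=np.array([0.0, 1.0]), h=0.1)
psi_t = propagate(lap, psi0, t=0.5)

# The propagator is unitary, so the norm of the state is preserved.
print(np.linalg.norm(psi0), np.linalg.norm(psi_t))
```

In the semiclassical picture the abstract alludes to, the mass of such a state would be expected to travel close to a geodesic. This sketch does not verify that behavior, nor does it choose the scalings of `eps`, `h`, and `t` that a convergence guarantee would require.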