Paper Title
Neural Integral Equations
Paper Authors
Paper Abstract
Nonlinear operators with long-distance spatiotemporal dependencies are fundamental to modeling complex systems across the sciences, yet learning such nonlocal operators remains a challenge in machine learning. Integral equations (IEs), which model these nonlocal systems, have wide-ranging applications in physics, chemistry, biology, and engineering. We introduce Neural Integral Equations (NIE), a method for learning unknown integral operators from data using an IE solver. To improve scalability and model capacity, we also present Attentional Neural Integral Equations (ANIE), which replaces the integral with self-attention. Both models are grounded in the theory of integral equations of the second kind, in which the indeterminate appears both inside and outside the integral operator. We provide a theoretical analysis showing how self-attention can approximate integral operators under mild regularity assumptions, further deepening previously reported connections between transformers and integration, and we derive corresponding approximation results for integral operators. Through numerical benchmarks on synthetic and real-world data, including the Lotka-Volterra, Navier-Stokes, and Burgers' equations, as well as brain dynamics and integral equations, we showcase the models' capabilities and their ability to derive interpretable dynamics embeddings. Our experiments demonstrate that ANIE outperforms existing methods, especially for longer time intervals and higher-dimensional problems. Our work addresses a critical gap in machine learning for nonlocal operators and offers a powerful tool for studying unknown complex systems with long-range dependencies.
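For reference, the standard textbook form of a Fredholm integral equation of the second kind (the general class the abstract refers to; the paper's specific operator may differ) makes the "inside and outside" structure explicit, with the unknown function y appearing both on the left-hand side and under the integral:

```latex
y(t) = f(t) + \lambda \int_{a}^{b} K(t, s)\, y(s)\, \mathrm{d}s
```

Here K is the kernel of the integral operator and f is a known source term; learning the unknown operator from data amounts to learning K (or a nonlinear generalization of it).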
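To illustrate the connection between self-attention and integration that ANIE builds on, the following is a minimal sketch, not the paper's implementation: softmax attention over n samples of a function acts like a quadrature rule, with the attention weights playing the role of a normalized kernel K(t, s). All names and projection choices below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def attention_as_integral(y, t, d_k=32):
    """Sketch: self-attention as an approximation of an integral operator,
    (K y)(t) ≈ sum_j w(t, s_j) v(y(s_j)), averaged over n sample points.

    y: (n, d) function values sampled at n points of the domain
    t: (n, 1) sample locations, concatenated so the kernel can depend on them
    """
    n, d = y.shape
    # Hypothetical random projections; in a trained model these are learned.
    Wq = torch.randn(d + 1, d_k) / (d + 1) ** 0.5
    Wk = torch.randn(d + 1, d_k) / (d + 1) ** 0.5
    Wv = torch.randn(d + 1, d) / (d + 1) ** 0.5

    z = torch.cat([y, t], dim=-1)                  # attach coordinates to values
    q, k, v = z @ Wq, z @ Wk, z @ Wv
    # Softmax weights act as a normalized kernel K(t, s); summing over the
    # n sample points is a Monte Carlo-style approximation of the integral.
    w = F.softmax(q @ k.T / d_k ** 0.5, dim=-1)    # (n, n) kernel matrix
    return w @ v                                   # (n, d) ≈ ∫ K(t,s) v(s) ds

# Usage: 100 samples of a scalar function on [0, 1]
t = torch.linspace(0, 1, 100).unsqueeze(-1)
y = torch.sin(2 * torch.pi * t)
out = attention_as_integral(y, t)
```

The point of the sketch is only the structural analogy: discretize the domain, form a kernel matrix from pairwise interactions, and contract it against function values, which is exactly what an integral operator does in quadrature form.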