Paper Title
Bias and Extrapolation in Markovian Linear Stochastic Approximation with Constant Stepsizes
Paper Authors
Paper Abstract
We consider Linear Stochastic Approximation (LSA) with a constant stepsize and Markovian data. Viewing the joint process of the data and LSA iterate as a time-homogeneous Markov chain, we prove its convergence to a unique limiting and stationary distribution in Wasserstein distance and establish non-asymptotic, geometric convergence rates. Furthermore, we show that the bias vector of this limit admits an infinite series expansion with respect to the stepsize. Consequently, the bias is proportional to the stepsize up to higher order terms. This result stands in contrast with LSA under i.i.d. data, for which the bias vanishes. In the reversible chain setting, we provide a general characterization of the relationship between the bias and the mixing time of the Markovian data, establishing that they are roughly proportional to each other. While Polyak-Ruppert tail-averaging reduces the variance of the LSA iterates, it does not affect the bias. The above characterization allows us to show that the bias can be reduced using Richardson-Romberg extrapolation with $m\ge 2$ stepsizes, which eliminates the $m-1$ leading terms in the bias expansion. This extrapolation scheme leads to an exponentially smaller bias and an improved mean squared error, both in theory and empirically. Our results immediately apply to the Temporal Difference learning algorithm with linear function approximation, Markovian data, and constant stepsizes.
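To make the bias and extrapolation claims concrete, here is a minimal sketch, not from the paper: a hypothetical scalar LSA recursion $\theta_{k+1} = \theta_k + \alpha\,(b(x_k) - a(x_k)\,\theta_k)$ driven by a two-state Markov chain, with all constants chosen purely for illustration. Because the recursion is linear and the chain is finite, the stationary mean of the joint chain $(x_k, \theta_k)$, which is also the limit of the Polyak-Ruppert tail average, solves a small linear system, so the asymptotic bias can be evaluated exactly and the $m = 2$ Richardson-Romberg combination $2\,\bar\theta(\alpha) - \bar\theta(2\alpha)$ can be checked directly.

```python
# Sketch (illustrative, not the paper's code): exact asymptotic bias of
# constant-stepsize scalar LSA with Markovian data, and its reduction by
# m = 2 Richardson-Romberg extrapolation. All constants are hypothetical.
import numpy as np

P = np.array([[0.7, 0.3],
              [0.3, 0.7]])        # transition kernel; stationary pi = (.5, .5)
pi = np.array([0.5, 0.5])
a = np.array([0.5, 1.5])          # state-dependent A(x) (scalar case)
b = np.array([2.0, 0.0])          # state-dependent b(x)
theta_star = (pi @ b) / (pi @ a)  # LSA target E_pi[A]^{-1} E_pi[b] (= 1.0 here)

def stationary_mean(alpha):
    """Exact E[theta_inf] for the joint chain (x_k, theta_k) at stepsize alpha.
    Stationarity of mu_j = E[theta * 1{x = j}] gives the 2x2 linear system
        mu = P^T (diag(1 - alpha * a) mu + alpha * (b * pi))."""
    D = np.diag(1.0 - alpha * a)
    mu = np.linalg.solve(np.eye(2) - P.T @ D, alpha * P.T @ (b * pi))
    return mu.sum()

alpha = 0.05
est_a  = stationary_mean(alpha)        # bias ~ B_1 * alpha + O(alpha^2)
est_2a = stationary_mean(2 * alpha)    # bias ~ 2 * B_1 * alpha + O(alpha^2)
est_rr = 2 * est_a - est_2a            # leading O(alpha) bias term cancels

print(f"bias at alpha   : {est_a  - theta_star:+.5f}")   # ~ +0.02439
print(f"bias at 2*alpha : {est_2a - theta_star:+.5f}")   # ~ +0.04762 (about 2x)
print(f"bias after RR   : {est_rr - theta_star:+.5f}")   # ~ +0.00116 (much smaller)
```

In this toy example the bias roughly doubles when the stepsize doubles, consistent with the leading $O(\alpha)$ term of the expansion, and the extrapolated estimate is more than an order of magnitude closer to $\theta^*$. Replacing `P` by the i.i.d. kernel `np.tile(pi, (2, 1))` drives the computed bias to exactly zero, matching the i.i.d. contrast stated in the abstract.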