To Share or Not to Share? Performance Guarantees and the Asymmetric Nature of Cross-Robot Experience Transfer

Paper Title

To Share or Not to Share? Performance Guarantees and the Asymmetric Nature of Cross-Robot Experience Transfer

Authors

Sorocky, Michael J.; Zhou, Siqi; Schoellig, Angela P.

Abstract

In the robotics literature, experience transfer has been proposed in different learning-based control frameworks to minimize the costs and risks associated with training robots. While various works have shown the feasibility of transferring prior experience from a source robot to improve or accelerate the learning of a target robot, there are usually no guarantees that experience transfer improves the performance of the target robot. In practice, the efficacy of transferring experience is often not known until it is tested on physical robots. This trial-and-error approach can be extremely unsafe and inefficient. Building on our previous work, in this paper we consider an inverse module transfer learning framework, where the inverse module of a source robot system is transferred to a target robot system to improve its tracking performance on arbitrary trajectories. We derive a theoretical bound on the tracking error when a source inverse module is transferred to the target robot and propose a Bayesian-optimization-based algorithm to estimate this bound from data. We further highlight the asymmetric nature of cross-robot experience transfer that has often been neglected in the literature. We demonstrate our approach in quadrotor experiments and show that we can guarantee positive transfer on the target robot for tracking random periodic trajectories.
