Meta-Learning Parameterized Skills

Paper Title

Meta-Learning Parameterized Skills

Authors

Haotian Fu, Shangqun Yu, Saket Tiwari, Michael Littman, George Konidaris

Abstract

We propose a novel parameterized skill-learning algorithm that aims to learn transferable parameterized skills and synthesize them into a new action space that supports efficient learning in long-horizon tasks. We propose to leverage off-policy Meta-RL combined with a trajectory-centric smoothness term to learn a set of parameterized skills. Our agent can use these learned skills to construct a three-level hierarchical framework that models a Temporally-extended Parameterized Action Markov Decision Process. We empirically demonstrate that the proposed algorithms enable an agent to solve a set of difficult long-horizon (obstacle-course and robot manipulation) tasks.
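
To make the hierarchy described in the abstract concrete, here is a minimal sketch of a parameterized action space with temporally extended skills. It is not the authors' implementation: the toy 1-D dynamics, the hand-written skills, the fixed high-level plan, and every name (`ParameterizedAction`, `run_skill`, `rollout`, `move_toward`, `hold`) are assumptions made for illustration. In the paper the skills are learned with off-policy Meta-RL and the high-level choice is itself a learned policy; neither is modeled here.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration; none of these names come from the paper.
State = float
SkillPolicy = Callable[[State, float], float]  # (state, parameter z) -> primitive action


@dataclass
class ParameterizedAction:
    skill_id: int  # discrete choice: which skill to run
    z: float       # continuous parameter passed to that skill


def step_env(state: State, action: float) -> State:
    """Toy 1-D dynamics: the primitive action is a displacement clipped to [-1, 1]."""
    return state + max(-1.0, min(1.0, action))


def run_skill(policy: SkillPolicy, state: State, z: float, horizon: int) -> State:
    """Middle level: execute one skill for a fixed number of primitive steps."""
    for _ in range(horizon):
        state = step_env(state, policy(state, z))  # lowest level: primitive action
    return state


def move_toward(state: State, z: float) -> float:
    """Skill 0 (hand-written stand-in for a learned skill): head for position z."""
    return z - state


def hold(state: State, z: float) -> float:
    """Skill 1: stay in place, ignoring z."""
    return 0.0


def rollout(skills: List[SkillPolicy], plan: List[ParameterizedAction],
            state: State, horizon: int = 5) -> State:
    """Top level: a fixed plan stands in for a learned high-level policy."""
    for act in plan:
        state = run_skill(skills[act.skill_id], state, act.z, horizon)
    return state


plan = [ParameterizedAction(0, 3.0), ParameterizedAction(1, 0.0), ParameterizedAction(0, -2.0)]
final_state = rollout([move_toward, hold], plan, state=0.0)
```

The point of the structure is that the high level only chooses a (skill, parameter) pair every `horizon` steps, so a long-horizon task becomes a much shorter decision problem over the new action space.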
