Paper Title
Meta-Learning Parameterized Skills
Paper Authors
Paper Abstract
We propose a novel parameterized skill-learning algorithm that aims to learn transferable parameterized skills and synthesize them into a new action space that supports efficient learning in long-horizon tasks. The algorithm combines off-policy Meta-RL with a trajectory-centric smoothness term to learn a set of parameterized skills. Our agent can use these learned skills to construct a three-level hierarchical framework that models a Temporally-extended Parameterized Action Markov Decision Process. We empirically demonstrate that the proposed algorithm enables an agent to solve a set of difficult long-horizon (obstacle-course and robot manipulation) tasks.