Multi-Agent Car Parking using Reinforcement Learning

Paper Title

Multi-Agent Car Parking using Reinforcement Learning

Paper Authors

Tanner, Omar

Paper Abstract

As the autonomous driving industry grows, so does the potential for interaction among groups of autonomous cars. Combined with advances in Artificial Intelligence and simulation, such groups can be simulated, and safety-critical models that control the cars within them can be learned. This study applies reinforcement learning to the problem of multi-agent car parking, where groups of cars aim to park themselves efficiently while remaining safe and rational. Utilising robust tools and machine learning frameworks, we design and implement a flexible car parking environment in the form of a Markov decision process with independent learners, exploiting multi-agent communication. We implement a suite of tools to perform experiments at scale, obtaining models that park up to 7 cars with a success rate above 98.1%, significantly beating existing single-agent models. We also obtain several results relating to the competitive and collaborative behaviours exhibited by the cars in our environment under varying densities and levels of communication. Notably, we discover a form of collaboration that cannot arise without competition, and a 'leaky' form of collaboration whereby agents collaborate without sufficient state. Such work has numerous potential applications in the autonomous driving and fleet management industries, and provides several useful techniques and benchmarks for the application of reinforcement learning to multi-agent car parking.

Paper Title