Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous Vehicles and Multi-Agent RL

Paper Title

Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous Vehicles and Multi-Agent RL

Authors

Eugene Vinitsky, Nathan Lichtle, Kanaad Parvate, Alexandre Bayen

Abstract

We study the ability of autonomous vehicles (AVs) to improve the throughput of a bottleneck using a fully decentralized control scheme in a mixed autonomy setting. We consider the problem of improving the throughput of a scaled model of the San Francisco-Oakland Bay Bridge: a two-stage bottleneck where four lanes reduce to two and then to one. Although there is extensive work examining variants of bottleneck control in a centralized setting, the challenging multi-agent setting has received less study; there, the large number of interacting AVs leads to significant optimization difficulties for reinforcement learning methods. We apply multi-agent reinforcement learning algorithms to this problem and demonstrate that significant improvements in bottleneck throughput can be achieved, from 20% at a 5% penetration rate to 33% at a 40% penetration rate. We compare our results to a hand-designed feedback controller and demonstrate that our results sharply outperform the feedback controller despite extensive tuning. Additionally, we demonstrate that the RL-based controllers adopt a robust strategy that works across penetration rates, whereas the feedback controllers degrade immediately upon penetration rate variation. We investigate the feasibility of both action and observation decentralization and demonstrate that effective strategies are possible using purely local sensing. Finally, we open-source our code at https://github.com/eugenevinitsky/decentralized_bottlenecks.
