Paper title
Emergent cooperation through mutual information maximization
Paper authors
Paper abstract
As artificial intelligence systems become ubiquitous in our society, their designers will soon have to consider their social dimension, since many of these systems will have to interact with one another to work efficiently. With this in mind, we propose a decentralized deep reinforcement learning algorithm for the design of cooperative multi-agent systems. The algorithm is based on the hypothesis that highly correlated actions are a feature of cooperative systems; hence, we propose adding to the learning problem an auxiliary objective that maximizes the mutual information between the agents' actions. We apply our system to a social dilemma, a problem whose optimal solution requires agents to cooperate to maximize a macroscopic performance function despite their divergent individual objectives. By comparing the performance of the proposed system with that of a system without the auxiliary objective, we conclude that maximizing mutual information among agents promotes the emergence of cooperation in social dilemmas.
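To make the auxiliary objective concrete, the following is a minimal sketch of how the mutual information between two agents' discrete actions could be estimated from samples and added to a task return. It is an illustration only: the abstract does not specify the estimator or the weighting used in the paper, so the plug-in estimator, the function names, and the weight `beta` are all assumptions.

```python
import math
from collections import Counter


def empirical_mutual_information(actions_a, actions_b):
    """Plug-in estimate of I(A; B), in nats, from paired discrete action samples.

    Hypothetical helper for illustration; not the estimator used in the paper.
    """
    if len(actions_a) != len(actions_b):
        raise ValueError("action sequences must have equal length")
    n = len(actions_a)
    if n == 0:
        return 0.0

    joint = Counter(zip(actions_a, actions_b))  # counts of (a, b) pairs
    marg_a = Counter(actions_a)                 # counts of agent A's actions
    marg_b = Counter(actions_b)                 # counts of agent B's actions

    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), written in terms of counts
        mi += (count / n) * math.log(count * n / (marg_a[a] * marg_b[b]))
    return mi


def augmented_objective(task_return, actions_a, actions_b, beta=0.1):
    """Task return plus a weighted mutual-information bonus.

    `beta` is an assumed trade-off weight, not a value taken from the paper.
    """
    return task_return + beta * empirical_mutual_information(actions_a, actions_b)
```

Under this estimator, two agents that always choose the same of two equally likely actions score log 2 nats, while agents whose actions are statistically independent score zero. This matches the hypothesis that highly correlated actions indicate cooperation.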