Paper Title

Aligning Superhuman AI with Human Behavior: Chess as a Model System

Authors

Reid McIlroy-Young, Siddhartha Sen, Jon Kleinberg, Ashton Anderson

Abstract

As artificial intelligence becomes increasingly intelligent, in some cases achieving superhuman performance, there is growing potential for humans to learn from and collaborate with algorithms. However, the ways in which AI systems approach problems are often different from the ways people do, and thus may be uninterpretable and hard to learn from. A crucial step in bridging this gap between human and artificial intelligence is modeling the granular actions that constitute human behavior, rather than simply matching aggregate human performance. We pursue this goal in a model system with a long history in artificial intelligence: chess. The aggregate performance of a chess player unfolds as they make decisions over the course of a game. The hundreds of millions of games played online by players at every skill level form a rich source of data in which these decisions, and their exact context, are recorded in minute detail. Applying existing chess engines to this data, including an open-source implementation of AlphaZero, we find that they do not predict human moves well. We develop and introduce Maia, a customized version of AlphaZero trained on human chess games, which predicts human moves with much higher accuracy than existing engines and can be tuned to achieve maximum accuracy when predicting decisions made by players at a specific skill level. For a dual task of predicting whether a human will make a large mistake on the next move, we develop a deep neural network that significantly outperforms competitive baselines. Taken together, our results suggest that there is substantial promise in designing artificial intelligence systems with human collaboration in mind by first accurately modeling granular human decision-making.
