Paper Title
Interaction Networks: Using a Reinforcement Learner to train other Machine Learning algorithms
Authors
Abstract
The wiring of neurons in the brain is more flexible than the wiring of connections in contemporary artificial neural networks. It is possible that this extra flexibility is important for efficient problem solving and learning. This paper introduces the Interaction Network. Interaction Networks aim to capture some of this extra flexibility. An Interaction Network consists of a collection of conventional neural networks, a set of memory locations, and a DQN or other reinforcement learner. The DQN decides when each of the neural networks is executed, and on what memory locations. In this way, the individual neural networks can be trained on different data, for different tasks. At the same time, the results of the individual networks influence the decision process of the reinforcement learner. This results in a feedback loop that allows the DQN to perform actions that improve its own decision-making. Any existing type of neural network can be reproduced in an Interaction Network in its entirety, with only a constant computational overhead. Interaction Networks can then introduce additional features to improve performance further. These make the algorithm more flexible and general, but at the expense of being harder to train. In this paper, thought experiments are used to explore how the additional abilities of Interaction Networks could be used to improve various existing types of neural networks. Several experiments have been run to prove that the concept is sound. These show that the basic idea works, but they also reveal a number of challenges that do not appear in conventional neural networks, which make Interaction Networks very hard to train. Further research needs to be done to alleviate these issues. A number of promising avenues of research to achieve this are outlined in this paper.
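The architecture described in the abstract can be sketched in a few lines: a controller repeatedly selects one sub-network and one memory location, and the chosen network reads and rewrites that location. This is only a minimal illustration under toy assumptions; the class and function names are hypothetical, the "networks" are plain callables, and a deterministic round-robin controller stands in for the DQN the paper actually uses.

```python
from itertools import count

class InteractionNetwork:
    """Minimal sketch of the architecture described above, assuming a toy
    setup: the sub-networks are plain callables and the controller stands
    in for the DQN. All names here are hypothetical illustrations."""

    def __init__(self, networks, num_memory_slots, controller):
        self.networks = networks            # sub-networks: float -> float
        self.memory = [0.0] * num_memory_slots
        self.controller = controller        # picks (network_idx, slot_idx)

    def step(self):
        # The controller (a DQN in the real design) decides which network
        # runs next and on which memory location.
        net_idx, slot_idx = self.controller(self.memory)
        # The chosen network reads and overwrites the chosen memory slot;
        # in the full design its result also feeds back into the
        # controller's own training signal.
        self.memory[slot_idx] = self.networks[net_idx](self.memory[slot_idx])
        return self.memory

# Toy stand-ins for independently trained sub-networks.
networks = [lambda x: x + 1.0, lambda x: x * 2.0]

# Deterministic round-robin controller, purely for the sketch; a real
# Interaction Network would learn this policy with a DQN.
_clock = count()
def controller(memory):
    t = next(_clock)
    return t % len(networks), t % len(memory)

inet = InteractionNetwork(networks, num_memory_slots=3, controller=controller)
for _ in range(6):
    inet.step()
print(inet.memory)  # -> [2.0, 1.0, 2.0]
```

Because the controller can route any network to any memory location, the same loop can emulate a fixed feed-forward wiring (by always choosing the same sequence of networks and slots), which is the sense in which any existing neural network can be reproduced with only constant overhead.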