论文标题
哪些形状具有表示形式?探索数据集,架构和培训
What shapes feature representations? Exploring datasets, architectures, and training
论文作者
论文摘要
在自然主义的学习问题中,模型的输入包含广泛的功能,有些对手头的任务有用,而另一些则没有。在有用的功能中,模型使用哪些功能?在任务 - 近关系功能中,模型代表哪些功能?这些问题的答案对于理解模型决策的基础以及学习多功能,适应性表示的模型的基础很重要,而不是原始培训任务。我们使用合成数据集研究了这些问题,其中可以直接控制输入功能的任务 - 功能。我们发现,当两个功能冗余地预测标签时,模型优先代表一个标签,其偏好反映了未经训练模型最线性解码的内容。在训练中,与任务相关的功能得到了增强,并且任务IRRELELELELERELELERELERELED功能被部分抑制。有趣的是,在某些情况下,更容易,弱的预测性功能可以抑制更强烈的预测性,但更困难。此外,经过训练的模型,可以识别简单和硬功能,学习与仅使用简单功能的模型最相似的表示形式。此外,与硬功能相比,简单的功能可导致模型运行的更一致的表示。最后,模型与未经训练的模型具有更大的代表性相似性,而不是在其他任务上训练的模型。我们的结果突出了确定哪个具有模型代表的复杂过程。
In naturalistic learning problems, a model's input contains a wide range of features, some useful for the task at hand, and others not. Of the useful features, which ones does the model use? Of the task-irrelevant features, which ones does the model represent? Answers to these questions are important for understanding the basis of models' decisions, as well as for building models that learn versatile, adaptable representations useful beyond the original training task. We study these questions using synthetic datasets in which the task-relevance of input features can be controlled directly. We find that when two features redundantly predict the labels, the model preferentially represents one, and its preference reflects what was most linearly decodable from the untrained model. Over training, task-relevant features are enhanced, and task-irrelevant features are partially suppressed. Interestingly, in some cases, an easier, weakly predictive feature can suppress a more strongly predictive, but more difficult one. Additionally, models trained to recognize both easy and hard features learn representations most similar to models that use only the easy feature. Further, easy features lead to more consistent representations across model runs than do hard features. Finally, models have greater representational similarity to an untrained model than to models trained on a different task. Our results highlight the complex processes that determine which features a model represents.