Paper Title
Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings
Paper Authors
Paper Abstract
Models based on the transformer architecture, such as BERT, have marked a crucial step forward in the field of Natural Language Processing. Importantly, they allow the creation of word embeddings that capture important semantic information about words in context. However, as single entities these embeddings are difficult to interpret, and the models used to create them have been described as opaque. Binder and colleagues proposed an intuitive embedding space where each dimension is based on one of 65 core semantic features. Unfortunately, the space only exists for a small dataset of 535 words, limiting its usefulness. Previous work (Utsumi, 2018, 2020; Turton, Vinson & Smith, 2020) has shown that Binder features can be derived from static embeddings and successfully extrapolated to a large new vocabulary. Taking the next step, this paper demonstrates that Binder features can be derived from the BERT embedding space. This provides contextualised Binder embeddings, which can aid in understanding semantic differences between words in context. It additionally provides insights into how semantic features are represented across the different layers of the BERT model.
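As a concrete illustration of the approach the abstract describes, the minimal sketch below extracts a contextual embedding for a word from BERT and maps it onto the 65 Binder feature dimensions with a regression probe. The choice of bert-base-uncased, the ridge-regression probe, and the placeholder training arrays are illustrative assumptions, not the authors' actual setup; the paper's own model and training data (the 535-word Binder ratings) would replace them.

```python
# Sketch: derive contextualised Binder features from BERT embeddings.
# Assumptions (not from the paper): bert-base-uncased, a linear ridge
# probe, and random placeholder data standing in for the Binder ratings.
import numpy as np
import torch
from transformers import BertTokenizer, BertModel
from sklearn.linear_model import Ridge

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def word_embedding(sentence: str, word: str, layer: int = -1) -> np.ndarray:
    """Return the contextual embedding of `word` in `sentence`,
    averaging its sub-word pieces from the chosen hidden layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]  # (seq_len, 768)
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    tokens = enc["input_ids"][0].tolist()
    # Locate the first occurrence of the word's sub-word span.
    for i in range(len(tokens) - len(word_ids) + 1):
        if tokens[i:i + len(word_ids)] == word_ids:
            return hidden[i:i + len(word_ids)].mean(dim=0).numpy()
    raise ValueError(f"{word!r} not found in {sentence!r}")

# X: BERT embeddings for the 535 Binder words; Y: their 65 feature
# ratings. Random placeholders stand in for the real Binder dataset.
X = np.random.randn(535, 768)
Y = np.random.rand(535, 65)
probe = Ridge(alpha=1.0).fit(X, Y)

# Contextualised Binder features for a word in a new context:
emb = word_embedding("The bank approved the loan.", "bank")
feats = probe.predict(emb[None, :])
print(feats.shape)  # (1, 65)
```

Because the probe takes the embedding from a specific layer, running it over each of BERT's hidden layers in turn gives a simple way to examine how well each layer supports the 65 semantic features, in the spirit of the layer-wise analysis the abstract mentions.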