Paper Title

Post-hoc analysis of Arabic transformer models

Paper Authors

Ahmed Abdelali, Nadir Durrani, Fahim Dalvi, Hassan Sajjad

Paper Abstract

Arabic is a Semitic language that is widely spoken in many dialects. Given the success of pre-trained language models, many transformer models trained on Arabic and its dialects have surfaced. While these models have been evaluated extrinsically on downstream NLP tasks, no work has analyzed and compared their internal representations. We probe how linguistic information is encoded in transformer models trained on different Arabic dialects. We perform a layer and neuron analysis on the models using morphological tagging tasks for several Arabic dialects and a dialect identification task. Our analysis yields interesting findings, such as: i) word morphology is learned at the lower and middle layers; ii) syntactic dependencies are predominantly captured at the higher layers; iii) despite a large overlap in their vocabularies, the MSA-based models fail to capture the nuances of Arabic dialects; and iv) neurons in the embedding layers are polysemous in nature, while neurons in the middle layers are exclusive to specific properties.
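To make the layer-analysis idea concrete, below is a minimal sketch of layer-wise probing: hidden states are extracted from every layer of a pre-trained Arabic transformer and a separate linear classifier is trained per layer on a tagging task, so that per-layer probe accuracy indicates where a property is encoded. The model id ("aubmindlab/bert-base-arabertv2"), the toy word/tag pairs, and the logistic-regression probe are illustrative assumptions, not the paper's exact experimental setup.

```python
# Layer-wise probing sketch (assumptions: model id, toy data, linear probe).
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

MODEL_ID = "aubmindlab/bert-base-arabertv2"  # assumed MSA-based model
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID, output_hidden_states=True)
model.eval()

# Toy word/POS pairs standing in for a real morphological tagging corpus.
words = ["كتب", "المدرسة", "جميل", "يذهب"]
labels = ["VERB", "NOUN", "ADJ", "VERB"]

def word_vectors(layer):
    """Representation of each word's first subword at the given layer."""
    vecs = []
    for w in words:
        enc = tokenizer(w, return_tensors="pt")
        with torch.no_grad():
            # hidden_states: tuple of (num_layers + 1) tensors,
            # index 0 is the embedding layer.
            hidden = model(**enc).hidden_states[layer]  # (1, seq_len, dim)
        vecs.append(hidden[0, 1].numpy())  # first token after [CLS]
    return vecs

# Train one linear probe per layer; higher accuracy suggests the layer
# encodes the property more accessibly. (A real study would evaluate on
# held-out data; this toy version reports training accuracy only.)
for layer in range(model.config.num_hidden_layers + 1):
    X = word_vectors(layer)
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    print(f"layer {layer:2d}: accuracy {accuracy_score(labels, probe.predict(X)):.2f}")
```

Neuron-level analysis follows the same recipe but ranks individual dimensions of the hidden state (e.g., by probe weight magnitude) instead of scoring whole layers.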
