Paper title
Transformer-based Text Classification on Unified Bangla Multi-class Emotion Corpus
Paper authors
Paper abstract
In this research, we propose a complete set of approaches for identifying and extracting emotions from Bangla texts. We provide a Bangla emotion classifier for six classes (anger, disgust, fear, joy, sadness, and surprise) built on transformer-based models, which have recently shown strong results, especially for high-resource languages. The Unified Bangla Multi-class Emotion Corpus (UBMEC) is used to assess the performance of our models. UBMEC is created by combining two previously released, manually labeled datasets of Bangla comments covering the six emotion classes with new Bangla comments that we labeled manually. The corpus and the code used in this work are publicly available.
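As an illustration of the corpus construction described in the abstract, the following is a minimal sketch of how several labeled sources could be merged into one six-class corpus. It is not the authors' released code: the function `unify`, the label-to-id mapping, the toy rows, and the choices to drop out-of-scheme labels and exact duplicate texts are all assumptions made for the example.

```python
from collections import Counter

# The six emotion classes named in the abstract.
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]
# Assumed mapping from class name to integer id (alphabetical order).
LABEL2ID = {name: i for i, name in enumerate(EMOTIONS)}


def unify(*sources):
    """Merge (text, label) pairs from several sources into one corpus.

    Assumptions of this sketch: labels are normalised to lower case,
    rows whose label is outside the six classes are dropped, and an
    exact duplicate text is kept only the first time it appears.
    """
    seen = set()
    corpus = []
    for source in sources:
        for text, label in source:
            text = text.strip()
            label = label.strip().lower()
            if not text or label not in LABEL2ID or text in seen:
                continue
            seen.add(text)
            corpus.append((text, LABEL2ID[label]))
    return corpus


# Hypothetical toy rows standing in for the two released datasets
# and the newly labeled comments; they are not taken from UBMEC.
released_a = [("আমি খুব খুশি", "joy"), ("এটা ভয়ংকর", "Fear")]
released_b = [("আমি খুব খুশি", "joy"), ("আমার খুব রাগ হচ্ছে", "anger")]
new_rows = [("কী অবাক কাণ্ড", "surprise"), ("ঠিক আছে", "neutral")]

corpus = unify(released_a, released_b, new_rows)
# Per-class counts, useful for checking class balance before training.
counts = Counter(EMOTIONS[label_id] for _, label_id in corpus)
```

The resulting `(text, label_id)` pairs are in the shape that a typical transformer fine-tuning pipeline expects for sequence classification with six output labels; the model and training setup themselves are outside the scope of this sketch.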