IIT Gandhinagar at SemEval-2020 Task 9: Code-Mixed Sentiment Classification Using Candidate Sentence Generation and Selection

Paper title

IIT Gandhinagar at SemEval-2020 Task 9: Code-Mixed Sentiment Classification Using Candidate Sentence Generation and Selection

Authors

Vivek Srivastava, Mayank Singh

Abstract

Code-mixing is the phenomenon of using multiple languages within a single utterance in text or speech. It is a frequently used pattern of communication on platforms such as social media sites, online gaming, and product reviews. Sentiment analysis of monolingual text is a well-studied task. Code-mixing makes sentiment analysis harder because of its non-standard writing style. We present an approach based on candidate sentence generation and selection, built on top of a Bi-LSTM based neural classifier, to classify Hinglish code-mixed text into one of three sentiment classes: positive, negative, or neutral. The proposed approach improves system performance compared with the Bi-LSTM based neural classifier. The results present an opportunity to study other tasks on code-mixed textual data, such as humor detection and intent classification.
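
The abstract does not say how candidate sentences are generated or how one is selected, so the following is only a minimal sketch of the general generate-then-select idea, not the authors' method. It assumes that candidates are spelling variants of romanized Hindi words and that selection keeps the most confident prediction; the `VARIANTS` table, the `toy_classifier` stand-in for a trained Bi-LSTM, and all function names are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical spelling variants for romanized Hindi words (illustrative only).
VARIANTS: Dict[str, List[str]] = {
    "acha": ["accha", "achha"],
    "bura": ["buraa"],
}


def generate_candidates(sentence: str) -> List[str]:
    """Return the lowercased sentence plus copies with one token swapped for a variant."""
    tokens = sentence.lower().split()
    candidates = [" ".join(tokens)]
    for i, tok in enumerate(tokens):
        for alt in VARIANTS.get(tok, []):
            candidates.append(" ".join(tokens[:i] + [alt] + tokens[i + 1:]))
    return candidates


def toy_classifier(sentence: str) -> Dict[str, float]:
    """Assumed stand-in for a trained Bi-LSTM: returns one probability per label."""
    positive_words = {"accha", "good", "love"}
    negative_words = {"buraa", "bad", "hate"}
    tokens = sentence.split()
    pos_hits = sum(t in positive_words for t in tokens)
    neg_hits = sum(t in negative_words for t in tokens)
    scores = {
        "positive": 1.0 + 2 * pos_hits,
        "negative": 1.0 + 2 * neg_hits,
        "neutral": 1.5,
    }
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def classify_with_selection(
    sentence: str,
    classifier: Callable[[str], Dict[str, float]] = toy_classifier,
) -> str:
    """Score every candidate and keep the label of the most confident prediction."""
    best_label, best_prob = "neutral", 0.0
    for candidate in generate_candidates(sentence):
        probs = classifier(candidate)
        label = max(probs, key=probs.get)
        if probs[label] > best_prob:
            best_label, best_prob = label, probs[label]
    return best_label
```

In this sketch the input "movie acha tha" contains a spelling the toy classifier does not know, but one of its generated candidates does contain a known spelling, so the selected prediction changes from neutral to positive. A real system would replace `toy_classifier` with the trained model and would need its own candidate generator and selection rule.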
