Paper Title
Stance in Replies and Quotes (SRQ): A New Dataset For Learning Stance in Twitter Conversations
Paper Authors
Paper Abstract
Automated ways to extract stance (denying vs. supporting opinions) from conversations on social media are essential to advancing opinion mining research. Recently, there has been renewed excitement in the field, with new models attempting to improve the state of the art. However, the datasets used for training and evaluating these models are often small. Additionally, these small datasets have imbalanced class distributions, i.e., only a tiny fraction of the examples in a dataset carry favoring or denying stances, while most other examples have no clear stance. Moreover, the existing datasets do not distinguish between the different types of conversations on social media (e.g., replying vs. quoting on Twitter). Because of this, models trained on one event do not generalize to other events. In the presented work, we create a new dataset by labeling stance in responses to posts on Twitter (both replies and quotes) on controversial issues. To the best of our knowledge, this is currently the largest human-labeled stance dataset for Twitter conversations, with over 5200 stance labels. More importantly, we design a tweet collection methodology that favors the selection of denial-type responses. This class is expected to be more useful for identifying rumors and determining antagonistic relationships between users. Moreover, we include several baseline models for learning stance in conversations and compare their performance. We show that combining data from replies and quotes decreases the accuracy of the models, indicating that the two modalities behave differently when it comes to stance learning.
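To make the stance-labeling task concrete, here is a minimal, purely illustrative sketch of a trivial lexicon-based stance baseline of the kind such datasets are used to benchmark. The four-way label set (support / deny / query / comment) and the cue-word lists are assumptions for illustration, not the paper's actual annotation scheme or any model it evaluates.

```python
# Hypothetical lexicon-based stance baseline for response tweets.
# The labels and cue words below are illustrative assumptions only.

DENY_CUES = {"fake", "false", "wrong", "lie", "hoax", "nonsense"}
SUPPORT_CUES = {"agree", "true", "exactly", "yes", "right"}


def predict_stance(response_text: str) -> str:
    """Assign a coarse stance label to a response (reply or quote) text."""
    # Normalize tokens: lowercase and strip trailing punctuation.
    tokens = {t.strip(".,!?").lower() for t in response_text.split()}
    if tokens & DENY_CUES:
        return "deny"
    if tokens & SUPPORT_CUES:
        return "support"
    if response_text.strip().endswith("?"):
        return "query"
    return "comment"


print(predict_stance("That is fake news"))      # deny
print(predict_stance("I agree completely"))     # support
print(predict_stance("Is this confirmed?"))     # query
```

A rule baseline like this illustrates why class imbalance matters: with few denial-type examples in a dataset, even weak cue-matching can look competitive, which is one motivation for a collection methodology that deliberately over-samples denying responses.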