Title
Distributional Ground Truth: Non-Redundant Crowdsourcing Data Quality Control in UI Labeling Tasks
Authors
Abstract
HCI increasingly employs Machine Learning and Image Recognition, in particular for visual analysis of user interfaces (UIs). A popular way of obtaining human-labeled training data is Crowdsourcing, typically relying on the quality-control methods of ground truth and majority consensus, both of which require redundancy in the outcome. In this paper we propose a non-redundant method for predicting crowdworkers' output quality in web UI labeling tasks, based on the homogeneity of distributions assessed with the two-sample Kolmogorov-Smirnov test. Using a dataset of about 500 screenshots, in which over 74,000 UI elements were located and classified by 11 trusted labelers and 298 Amazon Mechanical Turk crowdworkers, we demonstrate the advantage of our approach over a baseline model based on mean Time-on-Task. Exploring different dataset partitions, we show that with a trusted set comprising 17-27% of the UIs, our "distributional ground truth" model can achieve R² values above 0.8 and helps obviate the ancillary work effort and expenses.
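To make the core idea concrete, here is a minimal sketch (not the authors' code) of the two-sample Kolmogorov-Smirnov comparison the abstract describes: a crowdworker's output distribution is compared against the trusted labelers' distribution, and a large KS distance flags a likely low-quality worker. The feature being compared and all sample values below are hypothetical placeholders, since the abstract does not specify which per-element feature is used.

```python
import bisect

def ks_2samp_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    distance between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    d = 0.0
    # Evaluate both empirical CDFs at every observed value.
    for x in sorted(set(a) | set(b)):
        fa = bisect.bisect_right(a, x) / n
        fb = bisect.bisect_right(b, x) / m
        d = max(d, abs(fa - fb))
    return d

# Hypothetical per-element feature values (e.g. normalized bounding-box areas).
trusted = [0.2, 0.4, 0.5, 0.6, 0.8]
worker_good = [0.25, 0.45, 0.55, 0.65, 0.75]  # similar distribution -> small distance
worker_bad = [5.0, 6.0, 7.0, 8.0, 9.0]        # disjoint distribution -> distance near 1

print(ks_2samp_statistic(trusted, worker_good))  # small (~0.2)
print(ks_2samp_statistic(trusted, worker_bad))   # 1.0
```

In practice one would use `scipy.stats.ks_2samp`, which also returns a p-value for the homogeneity hypothesis; the pure-Python version above only computes the distance itself.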