Paper Title


Addressing contingency in algorithmic (mis)information classification: Toward a responsible machine learning agenda

Authors

Andrés Domínguez Hernández, Richard Owen, Dan Saattrup Nielsen, Ryan McConville

Abstract


Machine learning (ML) enabled classification models are becoming increasingly popular for tackling the sheer volume and speed of online misinformation and other content that could be identified as harmful. In building these models, data scientists need to take a stance on the legitimacy, authoritativeness and objectivity of the sources of "truth" used for model training and testing. This has political, ethical and epistemic implications which are rarely addressed in technical papers. Despite (and due to) their reported high accuracy and performance, ML-driven moderation systems have the potential to shape online public debate and create downstream negative impacts such as undue censorship and the reinforcing of false beliefs. Using collaborative ethnography and theoretical insights from social studies of science and expertise, we offer a critical analysis of the process of building ML models for (mis)information classification: we identify a series of algorithmic contingencies, key moments during model development that could lead to different future outcomes, uncertainty and harmful effects as these tools are deployed by social media platforms. We conclude by offering a tentative path toward reflexive and responsible development of ML tools for moderating misinformation and other harmful content online.
