Automatically Assessing Quality of Online Health Articles

Paper Title

Automatically Assessing Quality of Online Health Articles

Authors

Afsana, Fariha; Kabir, Muhammad Ashad; Hassan, Naeemul; Paul, Manoranjan

Abstract

Today's information ecosystem is overwhelmed by an unprecedented quantity of data on diverse topics and of varying quality. The quality of information disseminated in the field of medicine in particular has been questioned, because the negative consequences of health misinformation can be life-threatening. There is currently no generic automated tool for evaluating the quality of online health information across a broad range of topics. To address this gap, in this paper we apply a data mining approach to automatically assess the quality of online health articles based on 10 quality criteria. We prepared a labeled dataset with 53,012 features and applied different feature selection methods to identify the best feature subset, with which our trained classifier achieved an accuracy of 84% to 90%, varying across the 10 criteria. Our semantic analysis of the features shows the underlying associations between the selected features and the assessment criteria, and further rationalizes our assessment approach. Our findings will help in identifying high-quality health articles and thus aid users in forming their opinions and making the right choice when seeking health-related help online.
