A Data-driven Cognitive Salience Model for Objective Perceptual Audio Quality Assessment

Paper Title

A Data-driven Cognitive Salience Model for Objective Perceptual Audio Quality Assessment

Authors

Pablo M. Delgado, Jürgen Herre

Abstract

Objective audio quality measurement systems often use perceptual models to predict the subjective quality scores of processed signals, as reported in listening tests. Most systems map different metrics of perceived degradation into a single quality score predicting subjective quality. This requires a quality mapping stage that is informed by real listening test data using statistical learning (i.e., a data-driven approach) with distortion metrics as input features. However, the amount of reliable training data is limited in practice, and usually not sufficient for a comprehensive training of large learning models. Models of cognitive effects in objective systems can, however, improve the learning model. Specifically, considering the salience of certain distortion types, they provide additional features to the mapping stage that improve the learning process, especially for limited amounts of training data. We propose a novel data-driven salience model that informs the quality mapping stage by explicitly estimating the cognitive/degradation metric interactions using a salience measure. Systems incorporating the novel salience model are shown to outperform equivalent systems that only use statistical learning to combine cognitive and degradation metrics, as well as other well-known measurement systems, for a representative validation dataset.
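As a rough illustration of the idea in the abstract, the sketch below shows one way a salience measure could weight degradation metrics before a quality mapping stage. It is not the authors' model: the function names, the softmax weighting, the linear mapping, and the 0 to 100 score range are all assumptions chosen for brevity, and in practice the coefficients would be fitted to listening test data.

```python
import math


def salience_weights(cognitive_metrics):
    """Turn per-distortion cognitive metrics into weights that sum to 1.

    Hypothetical choice: a softmax, so the most salient distortion dominates.
    """
    peak = max(cognitive_metrics)
    exps = [math.exp(c - peak) for c in cognitive_metrics]  # shifted for stability
    total = sum(exps)
    return [e / total for e in exps]


def interaction_features(degradation_metrics, weights):
    """Explicit cognitive/degradation interactions: one product per distortion type."""
    return [d * w for d, w in zip(degradation_metrics, weights)]


def map_quality(features, coeffs, bias):
    """Assumed linear quality mapping, clamped to a 0-100 score range."""
    score = bias + sum(c * f for c, f in zip(coeffs, features))
    return min(100.0, max(0.0, score))


# Two distortion types with equal salience and made-up coefficients.
weights = salience_weights([0.0, 0.0])
features = interaction_features([2.0, 4.0], weights)
predicted = map_quality(features, coeffs=[-10.0, -10.0], bias=100.0)
```

Feeding the mapping stage these products, rather than the raw metrics alone, spares the learner from having to discover the interactions from scarce training data, which is the motivation the abstract gives.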
