Paper title

Pathological speech detection using x-vector embeddings

Authors

Catarina Botelho, Francisco Teixeira, Thomas Rolland, Alberto Abad, Isabel Trancoso

Abstract

The potential of speech as a non-invasive biomarker to assess a speaker's health has been repeatedly supported by the results of multiple works, for both physical and psychological conditions. Traditional systems for speech-based disease classification have focused on carefully designed knowledge-based features. However, these features may not represent the disease's full symptomatology, and may even overlook its more subtle manifestations. This has prompted researchers to move in the direction of general speaker representations that inherently model symptoms, such as Gaussian supervectors, i-vectors, and x-vectors. In this work, we focus on the latter, to assess their applicability as a general feature extraction method for the detection of Parkinson's disease (PD) and obstructive sleep apnea (OSA). We test our approach against knowledge-based features and i-vectors, and report results for two European Portuguese corpora, for OSA and PD, as well as for an additional Spanish corpus for PD. Both x-vector and i-vector models were trained with an out-of-domain European Portuguese corpus. Our results show that x-vectors are able to perform better than knowledge-based features in same-language corpora. Moreover, while x-vectors performed similarly to i-vectors in matched conditions, they significantly outperformed i-vectors when domain mismatch occurred.
