Paper title
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion
Paper authors
Paper abstract
Recent years have witnessed the extraordinary development of automatic speaker verification (ASV). However, previous work shows that state-of-the-art ASV models are seriously vulnerable to voice spoofing attacks, while recently proposed high-performance spoofing countermeasure (CM) models focus solely on the standalone anti-spoofing task and ignore the subsequent speaker verification process. How to integrate CM and ASV remains an open question. A spoofing-aware speaker verification (SASV) challenge has recently taken place, with the argument that better performance can be delivered when the CM and ASV subsystems are optimized jointly. Under the challenge's scenario, the integrated systems proposed by the participants are required to reject both impostor speakers and spoofing attacks from target speakers, which intuitively and effectively matches the expectation of a reliable, spoofing-robust ASV system. This work focuses on fusion-based SASV solutions and proposes a multi-model fusion framework to leverage the power of multiple state-of-the-art ASV and CM models. The proposed framework improves the SASV-EER from 8.75% to 1.17%, an 86% relative improvement over the best baseline system in the SASV challenge.
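The relative improvement quoted in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: the helper name `relative_improvement` is hypothetical and does not come from the paper, and the only inputs are the two SASV-EER values stated above. The reduction works out to roughly 86.6%, consistent with the reported 86%.

```python
def relative_improvement(baseline_eer: float, new_eer: float) -> float:
    """Relative reduction of an error rate, as a percentage of the baseline.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return (baseline_eer - new_eer) / baseline_eer * 100.0


# SASV-EER values quoted in the abstract, in percent.
gain = relative_improvement(8.75, 1.17)
print(f"{gain:.1f}% relative improvement")
```

Because the equal error rate is a lower-is-better metric, the improvement is expressed as the fraction of the baseline error that was removed, not as a ratio of the two rates.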