The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022

Paper title

The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022

Authors

Cai, Qutang; Hong, Guoqiang; Ye, Zhijian; Li, Ximin; Li, Haizhou

Abstract

This technical report describes our systems for tracks 1, 2 and 4 of the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22). By combining several ResNet variants, our submission for track 1 attained a minDCF of 0.090 with an EER of 1.401%. By further incorporating three fine-tuned pre-trained models, our submission for track 2 achieved a minDCF of 0.072 with an EER of 1.119%. For track 4, our system consisted of voice activity detection (VAD), speaker embedding extraction, and agglomerative hierarchical clustering (AHC), followed by a re-clustering step based on a Bayesian hidden Markov model and by overlapped speech detection and handling. Our submission for track 4 achieved a diarisation error rate (DER) of 4.86%. All three submissions ranked second in their respective tracks.
