Paper title
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022
Authors
Abstract
This technical report describes our systems for Tracks 1, 2, and 4 of the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22). By combining several ResNet variants, our Track 1 submission attained a minDCF of 0.090 with an EER of 1.401%. By further incorporating three fine-tuned pre-trained models, our Track 2 submission achieved a minDCF of 0.072 with an EER of 1.119%. For Track 4, our system consisted of voice activity detection (VAD), speaker embedding extraction, and agglomerative hierarchical clustering (AHC), followed by a re-clustering step based on a Bayesian hidden Markov model and by overlapped speech detection and handling. Our Track 4 submission achieved a diarisation error rate (DER) of 4.86%. All three submissions ranked second in their respective tracks.
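For readers unfamiliar with the two verification metrics quoted in the abstract, the following is a minimal sketch of how EER and minDCF are commonly computed from trial scores. It is not the challenge's official scoring tool. The function names are hypothetical, the cost parameters (`p_target=0.05`, `c_miss=1.0`, `c_fa=1.0`) are assumed defaults rather than values stated in this report, and tied scores are not treated specially.

```python
import numpy as np

def error_rates(scores, labels):
    """Sweep a threshold over the sorted scores.

    Returns (frr, far): false-rejection and false-acceptance rates for
    each threshold position. labels: 1 = target trial, 0 = non-target.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    labels = labels[np.argsort(scores)]  # order trials by ascending score
    n_target = labels.sum()
    n_nontarget = len(labels) - n_target
    # Position 0 accepts every trial; position i rejects the i lowest scores.
    frr = np.concatenate([[0.0], np.cumsum(labels) / n_target])
    far = np.concatenate([[1.0], 1.0 - np.cumsum(1 - labels) / n_nontarget])
    return frr, far

def eer(scores, labels):
    """Equal error rate: the point where FRR and FAR are closest."""
    frr, far = error_rates(scores, labels)
    idx = np.argmin(np.abs(frr - far))
    return (frr[idx] + far[idx]) / 2.0

def min_dcf(scores, labels, p_target=0.05, c_miss=1.0, c_fa=1.0):
    """Minimum normalised detection cost (parameter values are assumptions)."""
    frr, far = error_rates(scores, labels)
    dcf = c_miss * p_target * frr + c_fa * (1.0 - p_target) * far
    # Normalise by the cost of the best trivial system (accept all or reject all).
    norm = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return dcf.min() / norm
```

Both functions take one similarity score per trial plus a matching list of 0/1 labels and return a fraction, so an EER of 1.401% corresponds to a return value of about 0.01401.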