Ensemble of Discriminators for Domain Adaptation in Multiple Sound Source 2D Localization

Paper Title

Ensemble of Discriminators for Domain Adaptation in Multiple Sound Source 2D Localization

Authors

Guillaume Le Moing, Don Joven Agravante, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Phongtharin Vinayavekhin

Abstract

This paper introduces an ensemble of discriminators that improves the accuracy of a domain adaptation technique for the localization of multiple sound sources. Recently, deep neural networks have led to promising results for this task, yet they require a large amount of labeled data for training. Recording and labeling such datasets is very costly, especially because data needs to be diverse enough to cover different acoustic conditions. In this paper, we leverage acoustic simulators to inexpensively generate labeled training samples. However, models trained on synthetic data tend to perform poorly with real-world recordings due to the domain mismatch. For this, we explore two domain adaptation methods using adversarial learning for sound source localization which use labeled synthetic data and unlabeled real data. We propose a novel ensemble approach that combines discriminators applied at different feature levels of the localization model. Experiments show that our ensemble discrimination method significantly improves the localization performance without requiring any label from the real data.
