Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion

Title

Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion

Authors

Narjes Bozorg, Michael T. Johnson

Abstract

This paper presents Articulatory-WaveNet, a new approach for acoustic-to-articulatory inversion. The proposed system uses the WaveNet speech synthesis architecture, with dilated causal convolutional layers that operate on previous values of the predicted articulatory trajectories, conditioned on acoustic features. The system was trained and evaluated on the ElectroMagnetic Articulography corpus of Mandarin Accented English (EMA-MAE), consisting of 39 speakers including both native English speakers and native Mandarin speakers speaking English. Results show significant improvement in both correlation and RMSE between the generated and true articulatory trajectories for the new method, with an average correlation of 0.83, representing a 36% relative improvement over the 0.61 correlation obtained with a baseline Hidden Markov Model (HMM)-Gaussian Mixture Model (GMM) inversion framework. To the best of our knowledge, this paper presents the first application of a point-by-point waveform synthesis approach to the problem of acoustic-to-articulatory inversion, and the results show improved performance compared to previous methods for speaker-dependent acoustic-to-articulatory inversion.
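
The two evaluation metrics named in the abstract, and the arithmetic behind the quoted 36% relative improvement, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, the trajectories are synthetic stand-ins, and it assumes the correlation is a Pearson correlation computed per trajectory.

```python
import numpy as np


def pearson_correlation(pred, true):
    """Pearson correlation between a predicted and a reference trajectory."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    pred_c = pred - pred.mean()
    true_c = true - true.mean()
    denom = np.sqrt((pred_c ** 2).sum() * (true_c ** 2).sum())
    return float((pred_c * true_c).sum() / denom)


def rmse(pred, true):
    """Root-mean-square error between two trajectories of equal length."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean((pred - true) ** 2)))


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


if __name__ == "__main__":
    # Synthetic stand-in for one articulatory channel (NOT data from the paper).
    t = np.linspace(0.0, 1.0, 200)
    true_traj = np.sin(2 * np.pi * 3 * t)
    pred_traj = true_traj + 0.1 * np.cos(2 * np.pi * 7 * t)

    print("correlation:", pearson_correlation(pred_traj, true_traj))
    print("RMSE:", rmse(pred_traj, true_traj))

    # Correlations quoted in the abstract: 0.83 (proposed) vs. 0.61 (HMM-GMM baseline).
    print("relative improvement:", relative_improvement(0.83, 0.61))
```

With the abstract's figures, (0.83 - 0.61) / 0.61 is approximately 0.36, which matches the reported 36% relative improvement.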
