Paper Title
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon
Paper Authors
Paper Abstract
Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a 'space' delimiter between words. Popular Bayesian non-parametric models for text segmentation use a Dirichlet process to jointly segment sentences and build a lexicon of word types. We introduce DP-Parse, which uses similar principles but only relies on an instance lexicon of word tokens, avoiding the clustering errors that arise with a lexicon of word types. On the Zero Resource Speech Benchmark 2017, our model sets a new speech segmentation state-of-the-art in 5 languages. The algorithm monotonically improves with better input representations, achieving yet higher scores when fed with weakly supervised inputs. Despite lacking a type lexicon, DP-Parse can be pipelined to a language model and learn semantic and syntactic representations as assessed by a new spoken word embedding benchmark.
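To make the Dirichlet-process segmentation idea mentioned in the abstract concrete, here is a minimal toy sketch of DP unigram word segmentation over a character string, the classic text setting the abstract contrasts with. This is my own illustration, not the DP-Parse code: the concentration parameter `alpha`, the per-character base distribution, and the toy lexicon are illustrative assumptions, and DP-Parse itself operates on continuous speech embeddings with an instance lexicon of word tokens rather than the type-count lexicon used here.

```python
# Toy sketch (illustration only, not the authors' implementation):
# Viterbi-style segmentation of a character string under a
# Dirichlet-process unigram model with a Chinese-restaurant-process
# word probability.

import math
from collections import Counter

def segment(chars, lexicon, total_tokens, alpha=1.0, max_len=8):
    """Return the highest-probability segmentation of `chars`.

    Each candidate word w is scored as
        (count(w) + alpha * P0(w)) / (total_tokens + alpha),
    where P0 is a simple per-character base distribution.
    """
    n = len(chars)
    best = [(-math.inf, 0)] * (n + 1)      # (log-prob, backpointer)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            w = chars[start:end]
            p0 = (1.0 / 27) ** len(w)       # toy base distribution
            p = (lexicon[w] + alpha * p0) / (total_tokens + alpha)
            score = best[start][0] + math.log(p)
            if score > best[end][0]:
                best[end] = (score, start)
    # Trace back the word boundaries.
    words, end = [], n
    while end > 0:
        start = best[end][1]
        words.append(chars[start:end])
        end = start
    return list(reversed(words))

# Usage: counts of previously observed word tokens bias the parse
# toward reusing known words over positing new ones.
lexicon = Counter({"do": 3, "you": 3, "see": 2, "the": 4, "dog": 2})
print(segment("doyouseethedog", lexicon, sum(lexicon.values())))
```

The sketch keeps only the decoding step with a fixed lexicon; the models discussed in the abstract jointly infer the segmentation and the lexicon, and DP-Parse replaces the clustered type counts with similarity to stored token instances.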