Paper Title
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon
Paper Authors
Paper Abstract
Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a 'space' delimiter between words. Popular Bayesian non-parametric models for text segmentation use a Dirichlet process to jointly segment sentences and build a lexicon of word types. We introduce DP-Parse, which uses similar principles but only relies on an instance lexicon of word tokens, avoiding the clustering errors that arise with a lexicon of word types. On the Zero Resource Speech Benchmark 2017, our model sets a new speech segmentation state-of-the-art in 5 languages. The algorithm monotonically improves with better input representations, achieving yet higher scores when fed with weakly supervised inputs. Despite lacking a type lexicon, DP-Parse can be pipelined to a language model and learn semantic and syntactic representations as assessed by a new spoken word embedding benchmark.
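To make the Dirichlet-process segmentation idea mentioned in the abstract concrete, here is a minimal toy sketch of DP unigram word segmentation over a character string, the classic text setting the abstract contrasts with. This is my own illustration, not the DP-Parse code: the concentration parameter `alpha`, the per-character base distribution, and the toy lexicon are illustrative assumptions, and DP-Parse itself operates on continuous speech embeddings with an instance lexicon of word tokens rather than the type-count lexicon used here.

```python
# Toy sketch (illustration only, not the authors' implementation):
# Viterbi-style segmentation of a character string under a
# Dirichlet-process unigram model with a Chinese-restaurant-process
# word probability.

import math
from collections import Counter

def segment(chars, lexicon, total_tokens, alpha=1.0, max_len=8):
    """Return the highest-probability segmentation of `chars`.

    Each candidate word w is scored as
        (count(w) + alpha * P0(w)) / (total_tokens + alpha),
    where P0 is a simple per-character base distribution.
    """
    n = len(chars)
    best = [(-math.inf, 0)] * (n + 1)      # (log-prob, backpointer)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            w = chars[start:end]
            p0 = (1.0 / 27) ** len(w)       # toy base distribution
            p = (lexicon[w] + alpha * p0) / (total_tokens + alpha)
            score = best[start][0] + math.log(p)
            if score > best[end][0]:
                best[end] = (score, start)
    # Trace back the word boundaries.
    words, end = [], n
    while end > 0:
        start = best[end][1]
        words.append(chars[start:end])
        end = start
    return list(reversed(words))

# Usage: counts of previously observed word tokens bias the parse
# toward reusing known words over positing new ones.
lexicon = Counter({"do": 3, "you": 3, "see": 2, "the": 4, "dog": 2})
print(segment("doyouseethedog", lexicon, sum(lexicon.values())))
```

The sketch keeps only the decoding step with a fixed lexicon; the models discussed in the abstract jointly infer the segmentation and the lexicon, and DP-Parse replaces the clustered type counts with similarity to stored token instances.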