Benchmarking Learned Indexes

Paper Title

Benchmarking Learned Indexes

Authors

Ryan Marcus, Andreas Kipf, Alexander van Renen, Mihail Stoian, Sanchit Misra, Alfons Kemper, Thomas Neumann, Tim Kraska

Abstract

Recent advancements in learned index structures propose replacing existing index structures, like B-Trees, with approximate learned models. In this work, we present a unified benchmark that compares well-tuned implementations of three learned index structures against several state-of-the-art "traditional" baselines. Using four real-world datasets, we demonstrate that learned index structures can indeed outperform non-learned indexes in read-only in-memory workloads over a dense array. We also investigate the impact of caching, pipelining, dataset size, and key size. We study the performance profile of learned index structures, and build an explanation for why learned models achieve such good performance. Finally, we investigate other important properties of learned index structures, such as their performance in multi-threaded systems and their build times.
