论文标题

弦吸引子和无限单词

String Attractors and Infinite Words

论文作者

Restivo, Antonio, Romana, Giuseppe, Sciortino, Marinella

论文摘要

在数据压缩的背景下,在[Kempa和Prezza,2018]中引入了弦吸引子的概念,它代表了一个有限词的一组位置,其中所有因素都可以“吸引”。用于有限词的字符串吸引子的最小尺寸$γ^*$是与最常见的压缩方案相关的多种重复度量的下限,包括基于BWT和基于LZ的压缩机。在[Mantaci等,2021]中研究了该度量$γ^*$的组合特性。最近,通过评估每个前缀上的$γ^*$,为无限单词引入了一种称为字符串吸引子配置文件功能的复杂度度量。已经研究了这种措施的自动序列和线性复发的无限单词[Schaeffer and Shallit,2021]。在本文中,我们研究了这种复杂性度量与其他众所周知的组合概念之间的关系,这些概念与无限单词的背景下(例如因子复杂性和复发性)之间的关系。此外,我们引入了新的基于弦吸引子的复杂性度量,其中考虑了无限单词前缀的弦吸引子中位置的结构和分布。我们表明,这种措施为一些无限的单词家庭提供了更好的分类。

The notion of string attractor has been introduced in [Kempa and Prezza, 2018] in the context of Data Compression and it represents a set of positions of a finite word in which all of its factors can be "attracted". The smallest size $γ^*$ of a string attractor for a finite word is a lower bound for several repetitiveness measures associated with the most common compression schemes, including BWT-based and LZ-based compressors. The combinatorial properties of the measure $γ^*$ have been studied in [Mantaci et al., 2021]. Very recently, a complexity measure, called string attractor profile function, has been introduced for infinite words, by evaluating $γ^*$ on each prefix. Such a measure has been studied for automatic sequences and linearly recurrent infinite words [Schaeffer and Shallit, 2021]. In this paper, we study the relationship between such a complexity measure and other well-known combinatorial notions related to repetitiveness in the context of infinite words, such as the factor complexity and the recurrence. Furthermore, we introduce new string attractor-based complexity measures, in which the structure and the distribution of positions in a string attractor of the prefixes of infinite words are considered. We show that such measures provide a finer classification of some infinite families of words.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源