Grammar-compressed Self-index with Lyndon Words [article]

Kazuya Tsuruta and Dominik Köppl and Yuto Nakashima and Shunsuke Inenaga and Hideo Bannai and Masayuki Takeda
2020 arXiv pre-print
We introduce a new class of straight-line programs (SLPs), named the Lyndon SLP, inspired by the Lyndon trees (Barcelo, 1990). Based on this SLP, we propose a self-index data structure of O(g) words of space that can be built from a string T in O(n lg n) expected time, retrieving the starting positions of all occurrences of a pattern P of length m in O(m + lg m lg n + occ lg g) time, where n is the length of T, g is the size of the Lyndon SLP for T, and occ is the number of occurrences of P in T.
arXiv:2004.05309v2 fatcat:zpzktt64nbh4nfkhn6nbv25xem
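
As an illustration of the idea behind the abstract, the following minimal sketch builds a Lyndon-tree-shaped grammar for a small Lyndon word. It assumes the textbook definitions: a Lyndon word is strictly smaller than all of its proper suffixes, and the standard factorization of a Lyndon word w of length at least 2 is w = uv, where v is the longest proper suffix of w that is itself a Lyndon word. The function names (`is_lyndon`, `standard_factorization`, `lyndon_slp`, `expand`) are hypothetical, the input is assumed to already be a Lyndon word, and the naive quadratic-or-worse procedure is not the O(n lg n) construction or the self-index described in the paper.

```python
from typing import Dict, Tuple, Union

Rule = Union[str, Tuple[int, int]]  # a terminal character or a pair of variable ids


def is_lyndon(w: str) -> bool:
    """Return True if w is non-empty and strictly smaller than every proper suffix."""
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))


def standard_factorization(w: str) -> Tuple[str, str]:
    """Split a Lyndon word w (len >= 2) into (u, v), v being its longest proper Lyndon suffix."""
    for i in range(1, len(w)):  # smallest i gives the longest suffix
        if is_lyndon(w[i:]):
            return w[:i], w[i:]
    raise ValueError("input must be a Lyndon word of length at least 2")


def lyndon_slp(text: str) -> Tuple[int, Dict[int, Rule]]:
    """Naively build a grammar whose derivation tree is the Lyndon tree of text.

    Equal substrings share one variable, so the grammar can be smaller than the tree.
    Assumption: text is itself a Lyndon word.
    """
    if not is_lyndon(text):
        raise ValueError("this sketch only handles Lyndon words")
    rules: Dict[int, Rule] = {}
    ids: Dict[str, int] = {}

    def build(w: str) -> int:
        if w in ids:
            return ids[w]
        if len(w) == 1:
            rhs: Rule = w
        else:
            u, v = standard_factorization(w)
            rhs = (build(u), build(v))
        ids[w] = len(rules)  # assign the next free variable id
        rules[ids[w]] = rhs
        return ids[w]

    return build(text), rules


def expand(var: int, rules: Dict[int, Rule]) -> str:
    """Derive the string generated by a variable."""
    rhs = rules[var]
    if isinstance(rhs, str):
        return rhs
    return expand(rhs[0], rules) + expand(rhs[1], rules)


root, rules = lyndon_slp("aabab")
print(rules)
print(expand(root, rules))
```

In this sketch every variable derives a Lyndon word, and the two occurrences of "ab" in "aabab" share one variable, which hints at why such a grammar can serve as a compressed representation.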