Fast Label Extraction in the CDAWG
2017
Lecture Notes in Computer Science

The compact directed acyclic word graph (CDAWG) of a string T of length n takes space proportional just to the number e of right extensions of the maximal repeats of T, and it is thus an appealing index for highly repetitive datasets, like collections of genomes from similar species, in which e grows significantly more slowly than n. We reduce from O(m log log n) to O(m) the time needed to count the number of occurrences of a pattern of length m, using an existing data structure that takes an

doi:10.1007/978-3-319-67428-5_14
fatcat:etggr4qzwbeqpeycmljdl6zqjq