Counting suffix arrays and strings

Klaus-Bernd Schürmann, Jens Stoye
2008 Theoretical Computer Science  
Suffix arrays are used in various applications and research areas like data compression or computational biology. In this work, our goal is to characterise the combinatorial properties of suffix arrays and their enumeration. For a fixed alphabet size and string length, we divide the set of all strings into equivalence classes of strings that share the same suffix array. For each such equivalence class, we count the number of strings contained in it. We also give exact formulas for computing the number of equivalence classes. Our methods yield a lower bound for the compressibility of suffix arrays and build the foundation for the efficient generation of appropriate test data sets for suffix array based algorithms. We also show that summing up the elements of all equivalence classes forms a particular instance for some summation identities of Eulerian numbers.
doi:10.1016/j.tcs.2008.01.011 fatcat:tqb7fk5inrgpfjdhd6uibqkfb4
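
The equivalence classes described in the abstract can be illustrated by brute force for very small inputs. The following is a minimal sketch, not the paper's method: it enumerates every string of length n over an alphabet of size sigma, computes each suffix array by naive sorting, and groups the strings by the result. The function names `suffix_array` and `count_classes` are hypothetical, and the sketch assumes suffixes are compared lexicographically without a sentinel character, so a proper prefix sorts before the longer suffix.

```python
from collections import Counter
from itertools import product


def suffix_array(s):
    # Naive construction: sort the start positions by the suffix beginning there.
    # Assumption: no sentinel is appended; Python's string comparison places
    # a proper prefix before the longer string.
    return tuple(sorted(range(len(s)), key=lambda i: s[i:]))


def count_classes(n, sigma):
    # Map each distinct suffix array to the number of strings that share it.
    # Only feasible for tiny n and sigma, since sigma**n strings are enumerated.
    alphabet = [chr(ord("a") + k) for k in range(sigma)]
    return Counter(
        suffix_array("".join(letters)) for letters in product(alphabet, repeat=n)
    )


classes = count_classes(3, 2)
print(len(classes))            # number of equivalence classes
print(sum(classes.values()))   # total number of strings, sigma**n
```

For n = 3 over a binary alphabet this gives five classes covering all eight strings; the class with suffix array (2, 1, 0) contains four of them (aaa, baa, bba, bbb), and the remaining four classes contain one string each. The closed formulas mentioned in the abstract replace this exponential enumeration.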