A Space-Optimal Grammar Compression

Yoshimasa Takabatake, Tomohiro I, Hiroshi Sakamoto, Marc Herbstritt
2017 European Symposium on Algorithms  
A grammar compression is a context-free grammar (CFG) that deterministically derives a single string. For an input string of length N over an alphabet of size σ, the smallest CFG is O(lg N)-approximable in the offline setting and O(lg N lg* N)-approximable in the online setting. In addition, an information-theoretic lower bound for representing a CFG in Chomsky normal form with n variables is lg(n!/n^σ) + n + o(n) bits. Although there is an online grammar compression algorithm that directly computes the succinct encoding of its output CFG with an O(lg N lg* N) approximation guarantee, the problem of optimizing its working space has remained open. We propose a fully-online algorithm whose working space in bits is asymptotically equal to the lower bound, with O(N lg lg n) compression time. In addition, we propose several techniques to boost grammar compression and show their efficiency by computational experiments.
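
As a rough illustration of the two notions in the abstract, the following minimal Python sketch expands a tiny grammar that derives exactly one string and evaluates the leading terms of the stated lower bound. It is not the paper's algorithm. The names `expand`, `lower_bound_bits`, and `rules` are hypothetical, the grammar lets terminals appear directly in binary rules for brevity (a strict Chomsky normal form would give each terminal its own variable), and the o(n) term of the bound is left out.

```python
import math

def expand(rules, symbol):
    """Return the single string derived from `symbol`.

    `rules` maps each variable to a pair (left, right); any symbol that is
    not a key of `rules` is treated as a terminal character (an assumption
    made for brevity in this sketch).
    """
    if symbol not in rules:
        return symbol
    left, right = rules[symbol]
    return expand(rules, left) + expand(rules, right)

def lower_bound_bits(n, sigma):
    """Leading terms lg(n!/n^sigma) + n of the bound; o(n) is omitted."""
    lg_factorial = math.lgamma(n + 1) / math.log(2)  # lg(n!) via ln Gamma
    return lg_factorial - sigma * math.log2(n) + n

# Hypothetical grammar with three variables; X3 is the start symbol.
rules = {
    "X1": ("a", "b"),
    "X2": ("X1", "X1"),
    "X3": ("X2", "X1"),
}

print(expand(rules, "X3"))
print(lower_bound_bits(1_000_000, 4))
```

Because every variable has exactly one rule and the rules are acyclic, the expansion is deterministic, which is what makes such a CFG usable as a compressed representation of its string.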
doi:10.4230/lipics.esa.2017.67 dblp:conf/esa/TakabatakeIS17 fatcat:chxvgfhnyvalnlzoigvtma7dvu