Compression of graphical structures

Yongwook Choi, Wojciech Szpankowski
2009 IEEE International Symposium on Information Theory
F. Brooks argues in [3] that there is "no theory that gives us a metric for information embodied in structure". Shannon himself alluded to it fifty years earlier in his little-known 1953 paper [14]. Indeed, in the past, information theory dealt mostly with "conventional data", be it textual, image, or video data. However, databases of various sorts have come into existence in recent years for storing "unconventional data", including biological data, web data, topographical maps, and medical data.
In compressing such data structures, one must consider two types of information: the information conveyed by the structure itself, and the information conveyed by the data labels implanted in the structure. In this paper, we attempt to address the former problem by studying the information content of graphical structures (i.e., unlabeled graphs). In particular, we consider Erdős-Rényi graphs G(n, p) over n vertices, in which each edge is added independently with probability p. We prove that the structural entropy of G(n, p) is (n choose 2) h(p) - n log n + O(n), where h(p) = -p log p - (1 - p) log(1 - p) is the entropy rate of a conventional memoryless binary source. Then, we design a two-stage encoding that optimally compresses unlabeled graphs up to the first two leading terms of the structural entropy.
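To give a feel for the quantities involved, here is a minimal numerical sketch. It is not the paper's encoder. It computes the binary entropy h(p), the entropy of a labeled G(n, p) graph (one independent coin flip per vertex pair), and a rough estimate of the structural entropy obtained by subtracting log2(n!) bits for the discarded labels. That subtraction is an assumption that is only meaningful for large n and for p not too close to 0 or 1; the function names are hypothetical and chosen for illustration.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy h(p), in bits, of a memoryless binary source with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a deterministic source carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def labeled_entropy(n: int, p: float) -> float:
    """Entropy in bits of a labeled G(n, p) graph: one coin flip per vertex pair."""
    return math.comb(n, 2) * binary_entropy(p)


def structural_entropy_estimate(n: int, p: float) -> float:
    """Leading-order estimate: labeled entropy minus log2(n!).

    Assumption for illustration only: almost every graph has n! distinct
    labelings, which is an asymptotic statement and fails for small n.
    """
    log2_factorial = math.lgamma(n + 1) / math.log(2)  # log2(n!) without overflow
    return labeled_entropy(n, p) - log2_factorial


if __name__ == "__main__":
    n, p = 1000, 0.1
    print(f"labeled bits:    {labeled_entropy(n, p):.1f}")
    print(f"structural bits: {structural_entropy_estimate(n, p):.1f}")
```

Under this estimate the savings from dropping labels grow like n log n, which is small next to the roughly n^2 h(p) / 2 bits of the labeled graph but is exactly the second-order term the abstract refers to.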
doi:10.1109/isit.2009.5205736 dblp:conf/isit/ChoiS09 fatcat:lffj3pr4mfdp5mngrkuf4c2d5q