An Entropy Heuristic to Optimize Decision Diagrams for Index-driven Search in Biological Graph Databases

Nicola Licheri, Elvio Amparone, Vincenzo Bonnici, Rosalba Giugno, Marco Beccuti
2021 International Conference on Information and Knowledge Management  
Graphs are a widely used structure for knowledge representation. Their uses range from biochemical to biomedical applications, and they have recently been employed in multi-omics analyses. A key computational task on graphs is the search for specific topologies contained in them. The task is known to be NP-complete, so indexing techniques are applied to deal with its complexity. In particular, techniques exploiting paths extracted from graphs have shown good performance in terms of time [...]ments, but they still suffer from the relatively large size of the produced index. We applied decision diagrams (DDs) as the index data structure, showing a good reduction in index size with respect to other approaches. Nevertheless, the size of a DD depends on its variable order. Because the search for an optimal order is an NP-complete task, variable order heuristics on DDs are applied by exploiting domain-specific information. Here, we propose a heuristic based on the information content of the labeled paths. Tests on well-studied biological benchmarks, which are an essential part of multi-omics graphs, show that the resultant size correlates with the information measure related to the paths and that the chosen order effectively reduces the index size.
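As a rough illustration of the idea of ordering DD variables by the information content of labeled paths, the following minimal sketch computes the Shannon entropy of the labels at each path position and ranks the positions by it. This is not the authors' implementation: the function names `position_entropy` and `entropy_order`, the fixed-length tuple representation of paths, and the choice of ascending entropy as the ranking direction are all assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def position_entropy(paths, i):
    """Shannon entropy (in bits) of the labels found at position i of the paths."""
    counts = Counter(path[i] for path in paths)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def entropy_order(paths):
    """Rank path positions (candidate DD variables) by ascending label entropy.

    Assumption: all paths have the same length, and lower-entropy positions
    are placed first. The abstract does not state the direction of the order.
    """
    length = len(paths[0])
    entropies = [position_entropy(paths, i) for i in range(length)]
    # Ties are broken by the original position index to keep the order stable.
    return sorted(range(length), key=lambda i: (entropies[i], i))


# Hypothetical labeled paths of length 3 extracted from a graph.
example_paths = [
    ("A", "X", "C"),
    ("B", "X", "C"),
    ("A", "X", "D"),
    ("B", "X", "C"),
]
print(entropy_order(example_paths))
```

In this toy input the middle position always carries the same label, so it has zero entropy and is ranked first under the assumed ascending order; reversing the sort would give the opposite convention.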
dblp:conf/cikm/LicheriABGB21 fatcat:wow7wwpiyzbzpo7i2bpxfs3sfm