Requirements of phylogenetic databases

L. Nakhleh, D. Miranker, F. Barbancon
Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.  
We examine the organizational impact on phylogenetic databases of the increasing sophistication in the need and use of phylogenetic data. A primary issue is the use of the unnormalized representation of phylogenies in Newick format as a primitive data type in existing phylogenetic databases. In particular, we identify and enumerate a list of potential applications of such databases and queries (use-cases) that biologists may wish to see integrated into a phylogenetic database management system.
We show there are many queries that would best be supported by a normalized data model where phylogenies are stored as lists of edges. Since many of the queries require transitive traversals of the phylogenies, we demonstrate, constructively, that complex phylogenetic queries can be conveniently constructed as Datalog programs. We address concerns with respect to the cost and performance of the normalized representation by developing and empirically evaluating a feasibility prototype.
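To make the edge-list idea concrete, here is a minimal Python sketch of a phylogeny stored as (parent, child) edges, with a transitive "ancestor" relation derived by the kind of recursive rules one would write in Datalog. It is an illustration only, not the authors' schema or prototype: the toy tree, the node names (`root`, `n1`, `A`, `B`, `C`), and the helper functions are all assumptions made for this example.

```python
# Hypothetical toy phylogeny; in Newick it would read ((A,B)n1,C)root;
# Normalized form: one (parent, child) tuple per edge.
EDGES = {("root", "n1"), ("root", "C"), ("n1", "A"), ("n1", "B")}


def ancestor_closure(edges):
    """Naive fixpoint evaluation of two Datalog-style rules:

        ancestor(X, Y) :- edge(X, Y).
        ancestor(X, Y) :- edge(X, Z), ancestor(Z, Y).
    """
    closure = set(edges)
    while True:
        # Join edge(X, Z) with ancestor(Z, Y) to derive ancestor(X, Y).
        derived = {(x, y) for (x, z) in edges for (z2, y) in closure if z == z2}
        if derived <= closure:  # nothing new: fixpoint reached
            return closure
        closure |= derived


def clade_leaves(edges, node):
    """Leaves (taxa) below `node`: descendants that are never a parent."""
    parents = {p for p, _ in edges}
    return sorted(
        y for x, y in ancestor_closure(edges) if x == node and y not in parents
    )


def common_ancestors(edges, a, b):
    """All nodes that are ancestors of both `a` and `b`."""
    closure = ancestor_closure(edges)
    return {x for x, y in closure if y == a} & {x for x, y in closure if y == b}
```

Queries such as `clade_leaves(EDGES, "n1")` or `common_ancestors(EDGES, "A", "C")` need the transitive relation, which is awkward to obtain when the tree is held only as an opaque Newick string. The naive fixpoint loop is chosen for readability; a real system would likely rely on a Datalog engine or recursive SQL instead.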
doi:10.1109/bibe.2003.1188940 dblp:conf/bibe/NakhlehMBPD03 fatcat:krakyprgzrd43ds5w4qjotkpqq