Exploiting uniqueness in query optimization

G. N. Paulley, Per-Åke Larson
CASCON First Decade High Impact Papers (CASCON '10), 2010
Functional dependency analysis can be applied to various problems in query optimization: selectivity estimation, estimation of (intermediate) result sizes, order optimization (in particular sort avoidance), cost estimation, and various problems in the area of semantic query optimization. Dependency analysis in an ANSI SQL relational model, however, is complicated by the existence of null values, three-valued logic, outer joins, and duplicate rows. In this thesis we define the notions of strict and lax functional dependencies, strict and lax equivalence constraints, and null constraints, which capture both a large set of the constraints implied by ANSI SQL query expressions, including outer joins, and a useful set of declarative constraints for ANSI SQL base tables, including unique, table, and referential integrity constraints. We develop and prove a sound set of inference axioms for this set of combined constraints, and formalize the set of constraints that hold in the result of each SQL algebraic operator. We define an extended functional dependency graph model (fd-graph) to represent these constraints, and present and prove correct a detailed polynomial-time algorithm to maintain this fd-graph for each algebraic operator. We illustrate the utility of this analysis with examples and additional theoretical results from two problem domains in query optimization: query rewrite optimizations that exploit uniqueness properties, and order optimization that exploits both functional dependencies and attribute equivalence. We show that the theory behind these two applications of dependency analysis is useful not only in relational database systems, but in non-relational database environments as well.
doi:10.1145/1925805.1925812 fatcat:fl3ei33mbjf7dgt7m7y3tr4kre
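
To make the idea of a uniqueness-based query rewrite in the abstract concrete, here is a minimal sketch. It is not the fd-graph algorithm from the thesis. It uses the classical attribute-closure computation over ordinary (strict) functional dependencies and assumes a single base table with no null values, so lax dependencies, outer joins, and three-valued logic are ignored. The function names, the `employee` table, and its columns are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A functional dependency lhs -> rhs, both sides given as attribute sets.
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def closure(attrs: Iterable[str], fds: List[FD]) -> Set[str]:
    """Return every attribute functionally determined by `attrs`.

    Classical fixed-point closure; assumes strict dependencies
    and no null values (an assumption of this sketch).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is covered and rhs adds something.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def distinct_is_redundant(
    projected: Iterable[str],
    keys: List[FrozenSet[str]],
    fds: List[FD],
) -> bool:
    """True if SELECT DISTINCT over `projected` cannot remove any row.

    If the projected columns determine some candidate key, they form a
    superkey, so no two rows of the table agree on all of them.
    """
    determined = closure(projected, fds)
    return any(key <= determined for key in keys)


# Hypothetical table employee(emp_id, email, dept):
# emp_id is the primary key and email is declared unique.
keys = [frozenset({"emp_id"})]
fds = [
    (frozenset({"emp_id"}), frozenset({"email", "dept"})),
    (frozenset({"email"}), frozenset({"emp_id"})),
]

# SELECT DISTINCT email, dept FROM employee: DISTINCT can be dropped.
print(distinct_is_redundant({"email", "dept"}, keys, fds))
# SELECT DISTINCT dept FROM employee: DISTINCT is still needed.
print(distinct_is_redundant({"dept"}, keys, fds))
```

Under these assumptions an optimizer could skip the duplicate-elimination step for the first query. Handling nullable columns and multi-table query expressions requires the extended constraints that the abstract describes.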