Observing SQL queries in their natural habitat

Torsten Grust, Jan Rittinger
2013 ACM Transactions on Database Systems  
We describe Habitat, a declarative observational debugger for SQL. Habitat facilitates true language-level (not: plan-level) debugging of, probably flawed, SQL queries that yield unexpected results. Users mark SQL subexpressions of arbitrary size and then observe whether these evaluate as expected. Habitat understands query nesting and free row variables in correlated subqueries, and generally aims to not constrain users while suspect subexpressions are marked for observation. From the marked
more » ... L text, Habitat's algebraic compiler derives a new query whose result represents the values of the desired observations. These observations are generated by the target SQL database host itself and are derived from the original data: Habitat does not require prior data extraction or extra debugging middleware. Experiments with TPC-H database instances indicate that observations impose a runtime overhead sufficiently low to allow for interactive debugging sessions. We discuss the design and internals of the observational SQL debugger HABITAT that helps users to identify errors, or "bugs", buried in queries. In particular, we pursue debugging of logical flaws that lead SQL queries to yield unexpected results or even runtime errors. We do not consider query engine or performance debugging here. A correct program is built from correct pieces. Observational debugging [Silva 2011; Pope and Naish 2003; Marlow et al. 2007 ] builds on this basic insight and promotes a debugging paradigm in which users observe program pieces to validate the elementary assumption that the pieces evaluate as expected. In this work, we apply the principles of observational debugging to SQL. Here, the pieces of a SQL query are its subqueries (if any) and, at a considerably finer granularity, its individual subexpressions. Ob-This is a preliminary release of an article accepted by ACM Transactions on Database Systems. The definitive version is currently in production at ACM and, when released, will supersede this version. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications
doi:10.1145/2445583.2445586 fatcat:6hz4d6wc4rb6fg62basq6l23eq