On saying "Enough already!" in SQL

Michael J. Carey, Donald Kossmann
1997 Proceedings of the 1997 ACM SIGMOD international conference on Management of data - SIGMOD '97  
In this paper, we study a simple SQL extension that enables query writers to explicitly limit the cardinality of a query result. We examine its impact on the query optimization and run-time execution components of a relational DBMS, presenting two approaches, a Conservative approach and an Aggressive approach, to exploiting cardinality limits in relational query plans. Results obtained from an empirical study conducted using DB2 demonstrate the benefits of the SQL extension and illustrate the tradeoffs between our two approaches to implementing it.

... N tuples in descending (ascending) order. If the sort directive is none, the Stop operator simply returns the first N tuples from its input stream; the none option is chosen by the optimizer when the Stop operator's input stream is known to already be appropriately sorted. The third parameter to Stop is a Sort Expression. If the sort directive is desc or asc, the Stop operator sorts its input according to this sort expression, which is usually identical to the ordering expression from the ORDER BY clause of the query.

Like other logical operators (e.g., Join), the Stop operator can have more than one physical operator that is capable of implementing it in a query plan. Clearly, the implementation of the Stop operator should at least depend on its sort directive. Accordingly, we define two different physical Stop operators here: Scan-Stop, for when the sort directive is none, and Sort-Stop, for when the sort directive is desc or asc. We now discuss a possible implementation and a cost model (for the optimizer's use) for each one.

Scan-Stop

The Scan-Stop operator is extremely simple. Scan-Stop is a pipelined operator that simply requests and then passes each of the first N tuples of its input stream on to its consumer (i.e., to the operator above it in the query plan), after which it closes down its input stream and returns an end-of-stream indicator to its consumer. As a result, the cost of the Scan-Stop operator itself is negligible, and the total cost of a query subplan rooted at a Scan-Stop operator is dominated by the cost required to produce the first N tuples of its input stream. In a state-of-the-art relational DBMS, the query optimizer's cost model provides estimates for the total cardinality of a plan's output (ALL), the cost to produce the first tuple of a plan's output (cost_1), and the cost to produce ALL output tuples (cost_ALL).
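The two physical Stop operators described above can be illustrated with a small Python sketch. It is not the paper's implementation: the function names `scan_stop` and `sort_stop`, the use of Python iterators as tuple streams, and the use of a bounded heap (`heapq`) for the sorting variant are all assumptions made for illustration. The text only states that Sort-Stop sorts its input according to the sort expression.

```python
import heapq
from typing import Any, Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def scan_stop(stream: Iterable[T], n: int) -> Iterator[T]:
    """Pipelined stop (sort directive 'none'): pass on the first n tuples.

    Hypothetical sketch: the input is any Python iterable standing in for
    the operator's input stream.
    """
    it = iter(stream)
    try:
        for _ in range(n):
            try:
                # Request exactly one tuple from the input and pass it on.
                yield next(it)
            except StopIteration:
                # Input had fewer than n tuples: signal end-of-stream.
                return
    finally:
        # Close down the input stream if it supports closing (generators do).
        close = getattr(it, "close", None)
        if close is not None:
            close()


def sort_stop(
    stream: Iterable[T],
    n: int,
    sort_expr: Callable[[T], Any],
    directive: str,
) -> List[T]:
    """Stop with sort directive 'desc' or 'asc': return the top or bottom n.

    Assumption: a bounded heap is used instead of a full sort; the result
    is the same as sorting by sort_expr and keeping the first n tuples.
    """
    if directive == "desc":
        return heapq.nlargest(n, stream, key=sort_expr)
    if directive == "asc":
        return heapq.nsmallest(n, stream, key=sort_expr)
    raise ValueError("directive must be 'desc' or 'asc'")
```

For example, `list(scan_stop(rows, 10))` consumes at most ten tuples from `rows`, while `sort_stop(rows, 10, lambda r: r[1], "desc")` must read all of `rows` before it can return anything, which is why the optimizer prefers the none directive when the input is already appropriately sorted.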
Given estimates for these quantities for the subplan that generates the input stream for the Scan-Stop operator, the optimizer can estimate the cost, cost_N ...
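As a rough illustration of how an estimate for the first N tuples might be derived from those three quantities, the sketch below interpolates linearly between the first-tuple cost and the total cost. The linear interpolation, the function name `estimate_cost_first_n`, and its parameter names are assumptions for illustration only; the formula actually used by the authors is not shown in this excerpt.

```python
def estimate_cost_first_n(
    n: int, all_card: float, cost_first: float, cost_all: float
) -> float:
    """Estimate the cost of producing the first n tuples of a subplan.

    Assumption (not taken from the source): the cost beyond the first tuple
    grows linearly with the fraction of the output that is produced.
    """
    if all_card <= 0:
        raise ValueError("all_card must be positive")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Never charge for more tuples than the subplan is expected to produce.
    fraction = min(n, all_card) / all_card
    return cost_first + fraction * (cost_all - cost_first)
```

Under this assumption, a pipelined subplan (low first-tuple cost) becomes much cheaper when N is small, whereas a blocking subplan whose first-tuple cost is close to its total cost gains almost nothing from the limit.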
doi:10.1145/253260.253302 dblp:conf/sigmod/CareyK97 fatcat:6qdrdihotnfizkjowtdmeu3wyu