Advanced Topics in Computer Systems
Joe Hellerstein & Timothy Roscoe
SQL Query Optimization
Basics
Given: A query joining n tables
The "Plan Space": Huge number of alternative, semantically
equivalent
plans.
The Perils of Error: Running time of plans can vary by many
orders of magnitude
Ideal Goal: Map a declarative query to the most efficient plan
tree.
Conventional Wisdom: You’re OK if you avoid the rotten plans.
Industrial State of the Art: Most optimizers use the System R
technique and work "OK" up to about 10 joins.
Wide variability in handling complex queries with aggregation,
subqueries, etc.; more on this when we get to the query rewrite paper.
Approach 1: The Optimization Oracle
(definitely not to be confused with the company of the same
name)
You’d like to get the following information, but in zero time:
- Consider each possible plan in turn.
- Run it & measure performance.
- The one that was fastest is the keeper.
Approach 2: Make Up a Heuristic & See if it Works
University INGRES
- always use NL-Join (indexed inner whenever possible)
- order relations from smallest to biggest
Oracle 6
- "Syntax-based" optimization
OK, OK. Approach 3: Think!
Three issues:
- define plan space to search
- do cost estimation for plans
- find an efficient algorithm to search through plan space for
"cheapest" plan
Selinger & the System R crowd were the first to do this right. Their
paper is the Bible of Query Optimization.
SQL Refresher
SELECT {DISTINCT} <list of columns>
FROM <list of tables>
{WHERE <list of "Boolean Factors" (predicates in CNF)>}
{GROUP BY <list of columns>
{HAVING <list of Boolean Factors>}}
{ORDER BY <list of columns>};
Semantics are:
- take Cartesian product (a/k/a cross-product) of tables in FROM
clause, projecting to only those columns that appear in other clauses
- if there’s a WHERE clause, apply all filters in it
- if there’s a GROUP BY clause, form groups on the result
- if there’s a HAVING clause, filter groups with it
- if there’s an ORDER BY clause, make sure output is in the right
order
- if there’s a DISTINCT modifier, remove dups
Of course the plans don’t do this exactly; query optimization
interleaves the first two steps (cross-product and filtering) into a
plan tree. GROUP BY, HAVING, DISTINCT and ORDER BY are applied at the
end, pretty much in that order.
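The step-by-step semantics above can be rendered naively in Python. The two toy tables and the query they mimic (a join with a filter, grouping, and a HAVING clause) are hypothetical:

```python
# Naive rendering of the SQL semantics above, in order: cross-product,
# WHERE, GROUP BY, HAVING, ORDER BY. Tables and query are toy examples.
from itertools import product

emp  = [("a", 1, 10), ("b", 1, 20), ("c", 2, 30)]   # (name, dno, sal)
dept = [(1, "SF"), (2, "LA")]                        # (dno, city)

# SELECT dno, SUM(sal) FROM emp, dept
# WHERE emp.dno = dept.dno AND dept.city = 'SF'
# GROUP BY dno HAVING SUM(sal) > 15 ORDER BY dno

# 1. Cartesian product of the FROM-clause tables.
xprod = list(product(emp, dept))

# 2. WHERE: apply every Boolean factor (join predicate and filter).
filtered = [(e, d) for (e, d) in xprod if e[1] == d[0] and d[1] == "SF"]

# 3. GROUP BY dno, collecting each group's sal values.
groups = {}
for (e, d) in filtered:
    groups.setdefault(e[1], []).append(e[2])

# 4. HAVING: keep only groups whose SUM(sal) exceeds 15.
kept = {dno: sum(sals) for dno, sals in groups.items() if sum(sals) > 15}

# 5. ORDER BY dno (DISTINCT, if present, would dedup here too).
ordered = sorted(kept.items())
print(ordered)   # [(1, 30)]
```

No real plan executes this way, of course; the point is that any plan tree the optimizer picks must be equivalent to this specification.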
Plan Space
All your favorite query processing algorithms:
- sequential & index (clustered/unclustered) scans
- NL-join, (sort)-merge join, hash join
- sorting & hash-based grouping
- plan flows in a non-blocking fashion with get-next iterators
Logical equivalences:
- Joins are commutative and associative (reorderable) (modulo
avoiding Cartesian products)
- Selection and projection can be "pushed" below joins (modulo
relevant info being preserved)
Note some assumptions folded in here:
- selections are always "pushed down"
- projections are always "pushed down"
- all this is only for single query blocks
Some other popular assumptions (System R)
- only left-deep plan trees
- avoid Cartesian products
Cost Estimation
The soft underbelly of query optimization.
Requires:
- estimation of costs for each operator based on input cardinalities
- both I/O & CPU costs to some degree of accuracy
- estimation of predicate selectivities to compute cardinalities
for next step
- assumption of independence across predicates
- decidedly an inexact science.
- "Selingerian" estimation (no longer state of the art, but not
really so far off.)
- # of tuples & pages
- # of values per column (only for indexed columns)
- These estimations done periodically (why not all the time?)
- back of envelope calculations: CPU cost is based on # of RSS
calls, no distinction between Random and Sequential IO
- when you can’t estimate, use the "wet finger" technique
- New alternative approaches:
- Sampling: so far only concrete results for base relations
- Histograms: Common in industry, much interesting research.
- Sketches: randomized "synopses" that can be used for estimation.
- Note connection to approximate query processing
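A back-of-envelope sketch of what "Selingerian" estimation boils down to. The 1/10 default and the interpolation formula follow the System R paper's magic numbers; the table statistics below are made up:

```python
# "Selingerian" selectivity estimation, sketched. The 1/10 default
# (used when no statistics exist) and the key-range interpolation come
# from the System R paper; the table statistics below are hypothetical.

def sel_eq(nkeys=None):
    """col = value: 1/NKEYS(col) if the distinct count is known, else 1/10."""
    return 1.0 / nkeys if nkeys else 1.0 / 10

def sel_range(low, high, lo_key, hi_key):
    """low < col < high: interpolate across the column's key range."""
    return (high - low) / (hi_key - lo_key)

def sel_and(*sels):
    """Independence assumption: conjunct selectivities simply multiply."""
    result = 1.0
    for s in sels:
        result *= s
    return result

# Estimated output cardinality for a scan of a 10,000-tuple table with
# predicates col1 = k (50 distinct values) and 0 < col2 < 100 (keys 0..1000):
ntuples = 10_000
est = ntuples * sel_and(sel_eq(nkeys=50), sel_range(0, 100, 0, 1000))
print(est)
```

The independence assumption is exactly where this goes wrong in practice: correlated predicates make the product a severe underestimate.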
Searching the Plan Space
- Exhaustive search
- Dynamic Programming (prunes useless subtrees): System R
- Top-down, transformative version of DP: Volcano, Cascades (used
in MS SQL Server?)
- Randomized search algorithms (e.g. Ioannidis & Kang)
- Job Scheduling techniques
The System R Optimizer’s Search Algorithm
Look only at left-deep plans: there are n! plans (not
factoring in choice of
join method)
Observation: many of those plans share common prefixes, so don’t
enumerate all
of them
Sounds like a job for … Dynamic Programming!
- Find all plans for accessing each base relation
- Include index scans when available on "SARGable" predicates
- For each relation, save cheapest unordered plan, and cheapest
plan for each "interesting order". Discard all others.
- Now, try all ways of joining all pairs of 1-table plans saved so
far. Save cheapest unordered 2-table plans, and cheapest "interesting
ordered" 2-table plans.
- note: secondary join predicates are just like selections that
can’t be pushed down
- Now try all ways of combining a 2-table plan with a 1-table plan.
Save cheapest unordered and interestingly ordered 3-way plans. You can
now throw away the 2-way plans.
- Continue combining k-way and 1-way plans until you have
a collection of full plan trees
- At top, satisfy GROUP BY and ORDER BY either by using
interestingly ordered plan, or by adding a sort node to unordered plan,
whichever is cheapest.
Some additional details:
- don’t combine a k-way plan with a 1-way plan if there’s
no predicate between them, unless all predicates have been used up
(i.e. postpone Cartesian products)
- Cost of sort-merge join takes existing order of inputs into
account.
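The enumeration above can be condensed into a short dynamic program. This sketch ignores interesting orders and join-method choice, postpones Cartesian products, and uses a made-up nested-loops cost model; the cardinalities, predicates, and selectivity are hypothetical:

```python
# Stripped-down System R DP over left-deep plans: ignores interesting
# orders and join-method choice, postpones Cartesian products, and uses
# a toy NL-join cost model. All statistics below are hypothetical.
from itertools import combinations

card = {"A": 100, "B": 10, "C": 1000}        # base-table cardinalities
preds = {frozenset("AB"), frozenset("BC")}   # which join predicates exist
SEL = 0.01                                   # flat join selectivity

def joinable(left_set, rel):
    # Postpone Cartesian products: require a predicate linking the sides.
    return any(frozenset((l, rel)) in preds for l in left_set)

# best[S] = (cost, output cardinality, plan string) for relation set S
best = {frozenset([r]): (0.0, float(card[r]), r) for r in card}

rels = set(card)
for k in range(2, len(rels) + 1):
    for subset in map(frozenset, combinations(sorted(rels), k)):
        for rel in sorted(subset):           # rel is the inner table
            left = subset - {rel}
            if left not in best or not joinable(left, rel):
                continue
            lcost, lcard, lplan = best[left]
            cost = lcost + lcard * card[rel]           # toy NL-join cost
            outcard = lcard * card[rel] * SEL
            if subset not in best or cost < best[subset][0]:
                best[subset] = (cost, outcard, f"({lplan} JOIN {rel})")

print(best[frozenset(rels)])   # cheapest full left-deep plan
```

Note how the `joinable` check implements the postpone-Cartesian-products rule, and how each subset keeps only its single cheapest plan, which is the whole point of the DP.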
Evaluation:
- Only brings complexity down to about n*2^(n-1), and you store
Choose(n, n/2) plans
- But no-Cartesian-products rule can make a big difference for some
queries.
- For worst queries, DP dies at 10-15 joins
- adding parameters to the search space makes things worse (e.g.
expensive predicates, distribution, parallelism, etc.)
Simple variations to improve plan quality:
- bushy trees: (2(n-1))!/(n-1)! plans; DP complexity is 3^n - 2^(n+1)
+ n + 1, and you need to store 2^n plans
(actually it’s worse for subtle reasons)
- consider cross products: maximizes optimization time
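The counts above are easy to check numerically; the formulas are the ones from the discussion, and Choose(n, n/2) is the number of plans stored at the widest level of the DP table:

```python
# Quick arithmetic behind the plan-space counts above: left-deep join
# orders, plans stored at the DP's widest level, and bushy trees.
from math import comb, factorial

def left_deep_plans(n):
    return factorial(n)                      # n! left-deep join orders

def bushy_plans(n):
    return factorial(2 * (n - 1)) // factorial(n - 1)

for n in (5, 10):
    print(n, left_deep_plans(n), comb(n, n // 2), bushy_plans(n))
# At n = 10: 3,628,800 left-deep orders but only 252 plans stored at
# the DP's widest level, versus 17,643,225,600 bushy trees.
```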
Subqueries 101: Selinger does a very complete job with the
basics.
- subqueries optimized separately
- uncorrelated vs. correlated subqueries
- uncorrelated subqueries are basically constants to be computed
once in execution
- correlated subqueries are like function calls
Extensible Query Optimization
In the 1980's and 1990's there was a lot of work on alternatives and
extensions to the relational model (especially object-oriented DBs,
later XML). In the 90's and 2000's, there has been work on dataflow
engines for networking (Click, P2) that look a lot like query
processors.
Challenge: design an extensible query optimizer that can be used for
non-relational contexts.
We read the Volcano paper both because it is one of the 2 main examples
of this work (the other being the optional paper on Starburst), and
because it presents an alternative search strategy to Selinger.
The Volcano style is used in MS SQL Server (where Graefe now works),
though SQL Server's optimizer is actually based on Volcano's
successor, called Cascades.
Core ideas:
- Constraint: Assume we have dataflow trees (it's interesting to think
about extending to multi-output DAGs and cyclic plans!)
- Extensible Search Space: Provide a mechanism to describe legal trees
based on user-defined rules. Essentially these rules describe a
"language" of op-code-based dataflow scripts that the query executor
can understand. So these rules are essentially grammars.
- Extensible dataflow properties: We need to capture the properties of
the output of a "subquery" (or "subtree" of the dataflow) in order to
ensure we construct a semantically correct plan. I.e. we need to
describe in logical terms what the stream of data at the root should
be. Since we want this to be extensible, we can't predetermine these
properties, so allow for opaque user-defined Abstract Data Types. The
rules and property ADTs must be written to construct and maintain
these properties correctly.
- Extensible cost estimation: There needs to be an extensible way
(based on ADTs) to provide cost information to compare the quality of
two plans. Preferably this should not be a single number; costs can be
multidimensional (e.g. response time and resource consumption.)
- Search algorithm: Given the above, the extensible optimizer should
be able to search through the space of equivalent queries that are
supported by the rules. Dynamic programming is used to do this more
efficiently than raw enumeration. Two variants on dynamic programming:
- Starburst does this by using the rules constructively (a.k.a.
bottom-up or forward-chaining search) a la Selinger, to build
increasingly larger plans.
- Volcano does this with goal-directed search (a.k.a. top-down or
backward-chaining search), starting with the plan output, and
recursing on all possible subtrees.
Some details:
- Useful to separate logical properties (e.g. the relational
description of a subresult, including subquery and schema), physical
properties (e.g. the order or partitioning of data in the stream), and
estimated properties (e.g. the expected number of tuples or data
distribution of values.) Volcano conflates logical and estimated.
- There's also an assumption that one has a set of logical operators
(e.g. relational algebra) and implementation or physical operators
(e.g. hashjoin). Both techniques accommodate physical operators with
no logical equivalents (particularly sort, but also things like
distribution of tuples across nodes).
- In Volcano terminology, transformation rules specify axioms for
logical equivalences: e.g. commutativity and associativity of joins.
Implementation rules talk about how to map logical operators to
physical ones.
- Physical properties are encapsulated in an ADT. Special operators
called enforcers (or glue in Starburst) may be needed to do physical
mappings on the data streams. A detail in Volcano is that to achieve a
subgoal with a particular physical property (e.g. a sort order), you
can either
- Find an implementation of a logical operator that produces that
physical property (e.g. sort-merge join), or
- Choose any implementation of a logical operator (e.g. hash-join),
and then use an enforcer (e.g. sort)
But be careful to avoid doing both (this is what the "excluding
physical property vector" is for). (Note: this issue doesn't come up
in the Starburst bottom-up approach.)
- Checking for properties is done with an applicability function for
the algorithm or enforcer.
Given these building blocks, Volcano works "top-down" as follows. It
will maintain a hashtable of existing subplans (hashed on
physical/logical properties) to memoize previous work.
- Consider the "top" of the query plan tree. We know the logical
and physical properties from the query specification. Assume
that a budget limit on the costs may be given to prune
search. Use our rules and properties to recurse as follows:
- if the desired output is in the hashtable with cost below the
limit, use it.
- else generate all subgoals based on:
- applicable transformations
- algorithms that give the required physical properties
- enforcers that give the required physical properties
- Graefe speaks of ordering these by their "promise", which isn't
described in detail.
- Apply the "move", update the resulting physical and logical
properties and costs. Add result to the hashtable. If cost is
still within budget, recurse (on root, or in the case of an
algorithm on its inputs).
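A toy rendering of the loop just described, with hypothetical operators, rules, and costs standing in for the extensible machinery; a memo table plays the role of the hashtable, and the `limit` argument plays the cost budget (the memo here ignores the budget, a simplification):

```python
# Toy top-down Volcano-style search: recurse on (logical goal, required
# physical property) pairs, memoize results, and carry a cost budget.
# All operators, rules, and costs are hypothetical stand-ins.

# Each "move" for a goal: (operator, operator cost, subgoals it requires).
MOVES = {
    ("join", "sorted"): [("sortmerge", 50, [("scan", "sorted"), ("scan", "sorted")]),
                         ("enforce-sort", 30, [("join", "any")])],  # enforcer
    ("join", "any"):    [("hashjoin", 20, [("scan", "any"), ("scan", "any")])],
    ("scan", "sorted"): [("indexscan", 15, [])],
    ("scan", "any"):    [("seqscan", 5, [])],
}

memo = {}  # (goal, property) -> (best cost, best plan); the "hashtable"

def optimize(goal, limit=float("inf")):
    if goal in memo:
        return memo[goal]
    best = (float("inf"), None)
    for op, opcost, subgoals in MOVES.get(goal, []):
        cost, plan = opcost, [op]
        for sg in subgoals:
            sc, sp = optimize(sg, limit - cost)   # shrink the budget
            cost += sc
            plan.append(sp)
        if cost < best[0] and cost <= limit:
            best = (cost, plan)
    memo[goal] = best
    return best

cost, plan = optimize(("join", "sorted"))
print(cost, plan)
```

Note how the sorted join can be satisfied two ways, per the enforcer discussion above: directly by sort-merge join, or by a cheaper hash join followed by a sort enforcer; the search picks whichever is cheaper.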
Some notes on this "top-down" exploration:
- Budget limits can be very useful. For example, if you can guess
the optimal budget limit, Volcano could radically prune the plan
space.
- Note that once any complete plan has been explored by the
optimizer, its cost can be used as a budget limit.
- The promise ordering of subgoals is critical to making this
work well! If promise is computed as the sum of cost-so-far and
expected-cost-of-subtree, this is essentially what's
called A*-search in the AI literature, except that A* will quit
when it finds any complete plan. Volcano will explore all plans, but
can use the complete plan's cost to prune.
- Note that the generation of all subgoals at each step isn't
necessary. In the Cascades optimizer that was the follow-on to
Volcano, this expansion was postponed and done on demand (as part of
the "promise" ordering).
Because of this pruning, top-down can look at less of the plan space
than bottom-up if it gets lucky.
Parting thoughts:
- Query optimization is a traditional "search" problem, and should be
broadly applicable to dataflow systems in general.
- Optimization does not rely on a declarative language. As
seen in Volcano, it can be driven off transformation rules on
dataflow operators. In fact, Cascades removes the hard distinction
between logical and physical operators and
transformations/implementation.
- Dataflow is becoming popular again outside the context of
databases:
witness Click, P2, and
Google's Map/Reduce. These
systems could all use an extensible optimizer infrastructure!