MAL Optimizers

One of the prime reasons to design the MAL intermediate language is to have a high-level description for database queries, which is easy to generate by a front-end compiler and easy to decode, optimize and interpret. In this section, we introduce the collection of MAL optimizers included in the codebase. The toolkit is built incrementally, driven by experimentation and curiosity. Several optimizers require further development to cope with the many features making up the MonetDB system. Such limitations on the implementation are indicated where appropriate.

Experience shows that construction and debugging of a front-end specific optimizer is simplified when you retain information on the origin of the MAL code produced as long as possible. For example, the snippet sql.insert(col, 12@0, "hello") can be the target of simple SQL rewrites using the module name as the discriminator.
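
Such a module-keyed rewrite can be sketched as follows; the tuple encoding (result, module, function, args) and the sql.append target are illustrative assumptions, not the actual MAL data structures:

```python
# Hypothetical instruction encoding: (result, module, function, args).
def rewrite_by_module(program, module, function, rewrite):
    """Apply a front-end specific rewrite to matching instructions only."""
    out = []
    for ins in program:
        _, mod, fn, _ = ins
        out.append(rewrite(ins) if (mod, fn) == (module, function) else ins)
    return out

# e.g. rewrite sql.insert(col, 12@0, "hello") into a hypothetical sql.append
prog = [("_1", "sql", "insert", ("col", "12@0", "hello"))]
print(rewrite_by_module(prog, "sql", "insert",
                        lambda i: (i[0], i[1], "append", i[3])))
```

The module name acts as the discriminator, so the SQL-specific pass never touches instructions produced by other front-ends.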

Pipeline validation. The pipelines used to optimize MAL programs contain dependencies. For example, it does not make much sense to call the dead-code optimizer too early in the pipeline, although doing so is not an error. Moreover, some optimizers are merely examples of the direction to take, while others are critical for the proper functioning of, e.g., SQL.

An optimizer needs several mechanisms to be effective. It should be able to perform a symbolic evaluation of a code fragment and collect the result in properties for further decision making. The prototypical case is where an optimizer estimates the result size of a selection. Another major issue is to be able to generate and explore a space of alternative evaluation plans. This exploration may take place upfront but can also be run at runtime for query fragments.

Optimizer landscape

A query optimizer is often a large and complex piece of code, which enumerates alternative evaluation plans from which "the best" plan is selected for evaluation. Limited progress has been made so far in decomposing the optimizer into (orthogonal) components, because it is a common belief in research that a holistic view on the problem is a prerequisite for finding the best plan. Conversely, commercial optimizers use a cost-model-driven approach, exploring part of the search space using a limited set of rewriting rules (up to 300).

Our hypothesis is that query optimization should be realized with a collection of Query Optimizer Transformers (QOT), each dedicated to a specific task. Furthermore, they are assembled in scenarios to support specific application domains or achieve a desired behavior. Such scenarios are selected on a session basis, a query basis, or dynamically at runtime; they are part of the query plan.

The query transformer list below is under consideration for development. For each we consider its goal, approach, and expected impact. Moreover, the minimal prerequisites identify the essential optimizers that should have done their work already. For example, it does not make sense to perform a static evaluation unless you have already propagated the constants using Alias Removal.

  • Constant expressions
    • Goal: to remove scalar expressions which need to be evaluated only once during the query's lifetime.
    • Rationale: static expressions appear when the variables used denote literal constants (e.g. 1+1), when catalog information can be merged into the plan (e.g. max(B.salary)), or when session variables are used that are initialized once (e.g. user()). Early evaluation aids subsequent optimization.
    • Approach: inspect all instructions to locate static expressions. Whether they should be removed depends on the expected re-use, which in most cases calls for an explicit request upon query registration to do so. The result of a static evaluation provides grounds for alias removal.
    • Impact: relevant for stored queries (MAL functions)
    • Prereq: alias removal, common terms.
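
The approach can be sketched as a single forward pass over the instruction list; the (result, op, args) encoding and the OPS table are illustrative assumptions, not the MonetDB implementation:

```python
import operator

# Hypothetical operator table for static scalar expressions.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def fold_constants(program, constants):
    """Fold expressions whose arguments are all known literal constants.

    program: list of (result, op, args); constants: known literal bindings.
    Returns the rewritten program and the extended constant environment.
    """
    folded, out = dict(constants), []
    for result, op, args in program:
        if op in OPS and all(a in folded for a in args):
            folded[result] = OPS[op](*(folded[a] for a in args))
            # the materialized constant is a ground for alias removal
            out.append((result, "const", (folded[result],)))
        else:
            out.append((result, op, args))
    return out, folded

out, env = fold_constants([("x", "add", ("one", "one"))], {"one": 1})
print(env["x"])  # 1+1 statically evaluated to 2
```

A real pass would also consult instruction properties to decide whether re-use justifies the early evaluation, as noted above.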
  • Alias Removal
    • Goal: to reduce the number of variables referencing the same value, thereby reducing the analysis complexity.
    • Rationale: query transformations often result in replacing the right-hand side expression with a result variable. This pollutes the code block with simple assignments, e.g. V:=T. In the downstream dataflow, occurrences of V can be replaced by T, provided V is never assigned a new value.
    • Approach: literal constants within a MAL block are already recognized and replaced by a single variable.
    • Impact: medium.
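
A minimal sketch of the substitution step, under the same hypothetical (result, op, args) encoding; 'alias' marks a simple assignment V:=T:

```python
def remove_aliases(program):
    """Drop single-assignment aliases and substitute their targets."""
    counts = {}
    for result, _, _ in program:
        counts[result] = counts.get(result, 0) + 1
    # only variables assigned exactly once are safe to substitute
    target = {r: a[0] for r, op, a in program
              if op == "alias" and counts[r] == 1}

    def resolve(v):              # follow chains V := W, W := T
        while v in target:
            v = target[v]
        return v

    return [(r, op, tuple(resolve(a) for a in args))
            for r, op, args in program
            if not (op == "alias" and r in target)]
```

For example, [V:=T; r:=join(V,x)] becomes [r:=join(T,x)], shrinking the block for later passes.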
  • Common Term Optimizer
    • Goal: to reduce the amount of work by avoiding calculation of the same operation twice.
    • Rationale: it simplifies code generation for front-ends, because they do not have to remember which subexpressions have already been evaluated; duplicates are much easier to detect at the MAL level.
    • Approach: simply walk through the instruction sequence and locate identical patterns. (This could be enhanced with semantically equivalent instructions.)
    • Impact: High
    • Prereq: Alias Removal
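
The walk over the instruction sequence might look as follows; the (result, op, args) encoding is again a hypothetical stand-in for the MAL structures:

```python
def eliminate_common_terms(program):
    """Map identical (op, args) patterns onto the first computed result."""
    seen = {}     # (op, resolved args) -> first result variable
    alias = {}    # later duplicate result -> earlier equivalent result
    out = []
    for result, op, args in program:
        key = (op, tuple(alias.get(a, a) for a in args))
        if key in seen:
            alias[result] = seen[key]   # reuse the earlier computation
        else:
            seen[key] = result
            out.append((result, op, key[1]))
    return out
```

Note the dependency on Alias Removal: resolving arguments through the alias map is what lets the pass recognize b:=add(x,y) as a duplicate of a:=add(x,y) in downstream uses.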
  • Dead Code Removal
    • Goal: to remove all instructions whose result is not used
    • Rationale: dead code may appear due to sloppy coding or alternative execution paths. The XML front-end Pathfinder, for example, is expected to produce a large number of simple assignments.
    • Approach: check that every instruction produces a value that is used elsewhere; remove those that do not.
    • Impact: low
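
This check is naturally expressed as a backward scan; the sketch below assumes the hypothetical (result, op, args) encoding and, for brevity, omits the exemption that instructions with side effects would need:

```python
def remove_dead_code(program, live):
    """Keep only instructions whose result feeds a later use or the exit.

    live: variables required at block exit (e.g. the return values).
    """
    live, kept = set(live), []
    for result, op, args in reversed(program):
        if result in live:
            kept.append((result, op, args))
            live.update(args)        # its operands become live too
    return list(reversed(kept))
```

Running it with live={"r"} on [a:=f(x); b:=f(y); r:=f(a)] drops the unused b.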
  • Operator Sort
    • Goal: to sort the dataflow graph in such a way as to reduce the cost, or to assure locality of access for operands.
    • Rationale: a simple optimization is to reorder the instructions for execution by permuting the query components.
    • Approach:
    • Impact:
  • Singleton Set
    • Goal: to replace sets that are known to contain precisely one tuple by scalar expressions.
    • Rationale: Singleton sets can be represented by value pairs in the MAL program, which reduces to a scalar expression.
    • Approach: Identify a set variable for replacement.
    • Impact:
  • Range Propagation
    • Goal: look for constant ranges in select statements and propagate them through the code.
    • Rationale: partitioned tables and views may give rise to expressions that contain multiple selections over the same BAT. If their arguments are constant, the result of such selects can sometimes be predicted, or the multiple selections can be cascaded into a single operation.
    • Impact: high, should be followed by alias removal and dead code removal
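
Cascading constant-range selections over the same BAT reduces to intersecting intervals; a minimal sketch, with BATs named by plain strings for illustration:

```python
def cascade_selects(selects):
    """Merge (bat, low, high) range selections per BAT into one interval."""
    merged = {}
    for bat, low, high in selects:
        if bat in merged:
            lo, hi = merged[bat]
            merged[bat] = (max(lo, low), min(hi, high))
        else:
            merged[bat] = (low, high)
    # an empty interval (lo > hi) means the result is statically empty
    return merged

print(cascade_selects([("B", 0, 100), ("B", 50, 200)]))  # {'B': (50, 100)}
```

When the merged interval is empty, the select can be replaced by an empty result outright, which is where the follow-up alias and dead-code passes pay off.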
  • Result Cacher
    • Goal: to reduce the processing cost by keeping track of expensive-to-compute intermediate results
    • Rationale:
    • Approach: result caching becomes active after an instruction has been evaluated. The result can be cached as long as its underlying operands remain unchanged. Result caching can be made transparent to the user, but affects the other query optimizers.
    • Impact: high
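
The invalidation rule ("valid as long as the underlying operands are unchanged") can be modeled with per-operand version counters; the class below is a hypothetical sketch, not the MonetDB cache:

```python
class ResultCache:
    """Cache instruction results keyed on (op, args) plus operand versions."""

    def __init__(self):
        self.cache = {}   # (op, args) -> (operand versions, result)

    def lookup(self, op, args, versions):
        hit = self.cache.get((op, args))
        if hit and hit[0] == tuple(versions[a] for a in args):
            return hit[1]          # operands unchanged: result still valid
        return None                # miss, or stale after an update

    def store(self, op, args, versions, result):
        self.cache[(op, args)] = (tuple(versions[a] for a in args), result)
```

Any update to a BAT bumps its version, silently invalidating every cached result that depends on it.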
  • Staged Execution
    • Goal: to split a query plan into a number of steps, such that the first response set is delivered as quickly as possible. The remainder is only produced upon request.
    • Rationale: interactive queries call for a quick response and an indication of the processing time involved to run the query to completion.
    • Approach: staged execution can be realized using a fragmentation scheme over the database, e.g. each table is replaced by a union of fragments. This fragmentation could be determined upfront by the user or is derived from the query and database statistics.
    • Impact: high
  • Code Parallelizer
    • Goal: to exploit parallel IO and cpu processing in both SMP and MPP settings.
    • Rationale: throwing more resources at a complex query helps, provided it is easy to determine that parallel processing recovers the administrative overhead.
    • Approach: every flow path segment can be handled by an independent process thread.
    • Impact: high
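
At its simplest, flow-path segments with no shared variables can be evaluated concurrently; a hedged sketch using Python threads as a stand-in for the interpreter's worker threads:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(segments, evaluate):
    """Evaluate independent flow-path segments on separate threads.

    Assumes the segments share no variables, so no synchronization
    beyond joining the results is needed.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(evaluate, segments))

print(run_parallel([[1, 2], [3, 4]], sum))  # → [3, 7]
```

The real decision point, as noted above, is estimating whether the gain outweighs the thread and coordination overhead.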
  • Query Evaluation Maps
    • Goal: to avoid touching any tuple that is not relevant for answering a query.
    • Rationale: the majority of the work in solving a query is to discard tuples of no interest and to find correlated tuples through join conditions. Ideally, the database learns these properties over time and re-organizes itself (or builds a map) to replace discarding by map lookup.
    • Approach: piggyback selection and joins as database fragmentation instructions
    • Impact: high
  • MAL Compiler (tactics)
    • Goal: to avoid interpretation of functional expressions
    • Rationale: interpretation of arithmetic expressions with an interpreter is always expensive. Replacing a complex arithmetic expression with a simple, dynamically compiled C function often pays off, especially for cached (MAL) queries.
    • Approach:
    • Impact: high
  • Dynamic Query Scheduler (tactics)
    • Goal: to organize the work so as to optimize resource usage
    • Rationale: straight interpretation of a query plan may not lead to the best use of the underlying resources. For example, the content of the runtime cache may provide an opportunity to save time by accessing a cached source.
    • Approach: query scheduling is the last step before a relational algebra interpreter takes over control. The scheduling step involves a re-ordering of the instructions within the boundaries imposed by the flow graph.
    • Impact: medium
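
The flow-graph boundary condition means any schedule must respect def-before-use ordering; within those bounds a greedy policy can, for instance, prefer instructions whose operands are cached. A hypothetical sketch (the "hot" set stands in for runtime cache contents):

```python
def schedule(program, hot, inputs):
    """Greedily reorder (result, op, args) instructions: prefer those
    whose operands are all hot, never violating def-before-use.
    Assumes an acyclic dataflow, so some instruction is always runnable."""
    pending, defined, out = list(program), set(inputs), []
    while pending:
        runnable = [i for i in pending
                    if all(a in defined for a in i[2])]
        pick = next((i for i in runnable
                     if all(a in hot for a in i[2])), runnable[0])
        pending.remove(pick)
        defined.add(pick[0])
        out.append(pick)
    return out
```

With hot={"y"}, the instruction consuming y is hoisted ahead of the one consuming the cold x, while a dependent instruction can never overtake its producer.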
  • Aggregate Groups
    • Goal: to reduce the cost of computing aggregate expressions
    • Rationale: many of our applications call for the calculation of aggregates over dynamically defined groupings. These call for lengthy scans, and it pays to piggyback all aggregate calculations, leaving their results in the cache for later consumption (e.g. by the optimizers).
    • Approach:
    • Impact: high
  • Demand Driven Interpreter (tactics)
    • Goal: to use the best interpreter and libraries geared at the task at hand
    • Rationale: interpretation of a query plan can be based on different computational models. A demand-driven interpretation starts at the intended output and 'walks' backward through the flow graph to collect the pieces, possibly in a pipelined fashion (the Volcano model).
    • Approach: merely calls for a different implementation of the core operators
    • Impact: high
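
Python generators give a compact model of this pull-based pipeline; the scan/select operators below are illustrative stand-ins, not MonetDB operators:

```python
def scan(rows):
    """Leaf operator: produce tuples on demand."""
    yield from rows

def select(pred, child):
    """Pipelined filter: pulls from its child only when pulled itself."""
    return (r for r in child if pred(r))

# pulling one result consumes only as much input as needed
result = select(lambda r: r > 2, scan([1, 2, 3, 4]))
print(next(result))  # → 3
```

Each next() on the output side propagates demand backward through the graph, which is exactly the Volcano-style control flow described above.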
  • Iterator Strength Reduction
    • Goal: to reduce the cost of iterator execution by moving instructions out of the loop.
    • Rationale: although iteration at the MAL level should be avoided due to its inherently low performance compared to built-in operators, it is not forbidden. In that case we should confine the iterator block to the minimal work needed.
    • Approach: inspect the flowgraph for each iterator and move instructions around.
    • Impact: low
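
Moving instructions out of the loop amounts to classic loop-invariant code motion; a sketch over the hypothetical (result, op, args) encoding:

```python
def hoist_invariants(loop_body, loop_vars):
    """Split an iterator block into hoistable and loop-resident parts.

    loop_vars: variables (re)assigned by the iterator itself.
    An instruction is invariant when none of its arguments change
    inside the loop.
    """
    changing = set(loop_vars) | {r for r, _, _ in loop_body}
    hoisted, body = [], []
    for result, op, args in loop_body:
        if all(a not in changing for a in args):
            hoisted.append((result, op, args))
            changing.discard(result)   # its value is now loop-constant
        else:
            body.append((result, op, args))
    return hoisted, body
```

For a body [t:=add(c,d); s:=add(s,t)] with loop variables {i, s}, the computation of t is hoisted while the accumulation of s stays inside.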
  • Accumulator Evaluation
    • Goal: to replace operators with cheaper ones.
    • Rationale: based on the actual state of the computation and the richness of the supporting libraries there may exist alternative routes to solve a query.
    • Approach: operator rewriting depends on properties; there is no general technique. The first implementation looks at calculator expressions, as they appear frequently in the RAM compiler.
    • Impact: high
    • Prereq: should be called after common term optimizer to avoid clashes.
    • Status: Used in the SQL optimizer.
  • Code Inliner
    • Goal: to reduce the calling depth of the interpreter and to obtain a better starting point for code squeezing
    • Rationale: substitution of code blocks (or macro expansion) leads to longer linear code sequences. This provides opportunities for squeezing. Moreover, at runtime building and managing a stackframe is rather expensive. This should be avoided for functions called repeatedly.
    • Impact: medium
    • Status: Used in the SQL optimizer to handle SQL functions.
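
Macro expansion of a call site can be sketched as follows; the function table, the %n renaming scheme, and the trailing 'assign' binding are all hypothetical conventions for illustration:

```python
def inline(program, functions):
    """Substitute function bodies at call sites.

    functions: name -> (params, body, return_var), where body is a list
    of (result, op, args) triples over the params and local variables.
    Locals are renamed with a per-call suffix to avoid clashes.
    """
    out, n = [], 0
    for result, op, args in program:
        if op in functions:
            params, body, ret = functions[op]
            n += 1
            env = dict(zip(params, args))        # params bind to caller args
            ren = lambda v, env=env, n=n: env.get(v, f"{v}%{n}")
            for r, o, a in body:
                out.append((ren(r), o, tuple(ren(x) for x in a)))
            out.append((result, "assign", (ren(ret),)))
        else:
            out.append((result, op, args))
    return out
```

After inlining, the flattened sequence is exactly what the squeezing passes (alias removal, common terms, dead code) operate on, and no stack frame is built at runtime.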
  • Code Outliner
    • Goal: to reduce the program size by replacing a group of instructions with a single instruction
    • Rationale: inverse macro expansion leads to shorter linear code sequences. This provides opportunities for less interpreter overhead, and to optimize complex, but repetitive instruction sequences with a single hardwired call
    • Approach: called explicitly to outline a module (or symbol)
    • Impact: medium
  • Garbage Collector
    • Goal: to release resources as quickly as possible
    • Rationale: BATs referenced from a MAL program keep resources locked.
    • Approach: in cooperation with a resource scheduler we should identify the BATs that can be released quickly. This requires a forced garbage collection call at the end of a BAT's lifespan.
    • Impact: high
    • Status: Implemented. Algorithm based on end-of-life-span analysis.
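
The core of end-of-life-span analysis is computing, per variable, the last instruction that touches it; a minimal sketch over the hypothetical (result, op, args) encoding:

```python
def last_use(program):
    """Map each variable to the index of its last defining or using
    instruction; the interpreter can release the BAT right after it."""
    spans = {}
    for pc, (result, _, args) in enumerate(program):
        spans[result] = pc          # (re)definition counts as a use
        for a in args:
            spans[a] = pc
    return spans
```

For [a:=f(); b:=f(a); c:=f(b)], the span of a ends at instruction 1, so its BAT can be unlocked before the final instruction runs.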
  • Foreign Key replacements
    • Goal: to improve multi-attribute joins over foreign key constraints
    • Rationale: the code produced by the SQL front-end involves foreign key constraints, which provide many opportunities for speedy code using a join index.
    • Impact: high
    • Status: Implemented in the SQL strategic optimizer.