What Exactly Does the Query Optimizer Consider with Parallel Plans?
It seems to me that there could be several parallel versions of a plan. These plans would have streams gathered and/or repartitioned at different places. Is that the case? If so, wouldn't considering parallelism multiply the number candidate plans by a large factor?
The designers were very aware of the potential for a combinatorial explosion so the optimizer prunes and discards node/group alternatives as soon as it becomes aware that the current task is unlikely to be part of a good low-cost plan overall.
The optimizer does not consider only whole execution plans, it works node by node, group by group, with all sorts of heuristics to prevent the search space growing too large too quickly. The memo structure also means that groups are never duplicated – if a node appears in two alternatives, the reference is reused rather than regenerated.
One of the safeguards is that cost-based optimization is separated out into stages (Transaction Processing, Quick Plan, Parallel Quick Plan, Full). Early stages do not consider parallel plans, and later stages are only run if entry conditions (including the cost of the best complete plan found so far) are met.
If parallelism is explored in the Parallel Quick Plan stage, the optimizer decides at the end of that stage whether the final plan will use parallelism or not.
The implementation of parallel execution exploration is reasonably straight forward. The logical tree is explored during Parallel Quick Plan almost exactly as it was during Quick Plan, but a parallel plan is required. This is achieved by setting the 'parallel execution' property to true at the root of the tree. Previous stages ran with this set to false.
Exchanges are introduced as necessary (following standard rules) for logical operations that support parallelism (for example parallel group by requires a stream partitioned by the group-by keys). A final step ensures that the parallel plan produced will return correct results.
Heuristic decisions are also made where (for example) different partitioning policies are possible. A hash join may be preceded by exchanges partitioning the join keys on both inputs, or it may use broadcast partitioning on just the build input. The optimizer does not generate both, it decides heuristically which one to go with.
All these things taken together mean that reasonable parallel plans can be generated in a reasonable space and time. I will be talking about parallelism, query plans, and the optimizer in depth at PASS Summit 2013.