How to get better estimation?

Kai 2016-01-22 05:22:30


In the query below (second statement in the list), the est vs actual is 4,460,140 vs 239,974 and this seems to slow the query performance. Is there any way to improve the est and performance?


SQLkiwi 2016-01-23 13:11:37
A few quick comments, because I am short of time right now:

  • The optimizer does not currently reorder NOT EXISTS at the same level, so you need to either express the query differently, or place the most selective NOT EXISTS clause first. From what I can see, this appears to be VanReconciledLog.
  • You should ensure statistics are reasonably representative of the underlying data distribution. This can be tricky with large tables and multi-column predicates, but you could consider filtered statistics on VanHistory for ActivityId = 4 and (separately) for ActivityId = 16. It may even be worth creating filtered indexes (which would create filtered statistics automatically, of course).
  • The TOP (1) subqueries need an ORDER BY clause to be deterministic. This could also remove the index and lazy spools if #SelectedCreditLimits is suitably indexed to support the predicates and order by.
  • The query is doing a bit too much in one go. The CASE expressions in particular mean that attributes are being read from tables even though they may not be used for that row. Consider splitting the query in two, broadly separating the CTE from the rest. Aim for a relatively small number of rows, with minimal key information. You should aim for no more than a few hundred thousand rows, based on the plan submitted. A second step would use that table to fetch columns needed for the CASE expressions etc. as needed.

The aim here is to improve the cardinality estimations (primarily through filtered indexes/statistics and reordering the not-exists clauses) so the hashes and sorts no longer spill. Removing the spools will help a bit (and ensure meaningful results). Breaking the query into two, if done well, should improve both estimates and overall execution time.

Once good performance is achieved, you may consider carefully recombining the two queries into one IF this improves performance meaningfully, and results in a stable execution plan being selected for different parameter values.

Finally, consider bringing the instance up to date with 2014 SP1 CU4 (you are currently on plain SP1). If nothing else, this will provide more information about the severity of the spills.

Kai 2016-01-25 05:59:16
Thanks so much for the tips. Another question for another statement in the same plan. The estimate vs actual is 1 vs 240,406. I cannot figure out the cause. What is wrong with it?