This plan is generating 96 million reads. Is there anything you can point out that could be improved?

alhedo 2013-10-28 21:14:20

SQLkiwi 2013-10-28 23:21:12
A non-anonymized plan would be most helpful, as others have mentioned. As it is, the plan seems remarkably reasonable.

Most cardinality estimates are in the right ballpark, which is fortunate given that the plan is driven to a large extent by a recursive common table expression; these are hard to estimate in general. There are a couple of plan areas that are concerning, which I'll come to in a moment.

The query seems quite large and complex, so I would start by breaking it down a bit. A natural place to start would be to run the recursive common table expression separately and store the results in a temporary table. The CTE is referenced twice, and fully computed twice in the query plan.

Materializing it would remove this duplication of effort, and result in much better cardinality and distribution information. You would also be free to create any potentially helpful indexes on the temporary table, giving the optimizer even better information to work with.
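
A minimal sketch of that approach follows; the table and column names here (dbo.Folders, #Hierarchy, and so on) are hypothetical placeholders, not taken from the anonymized plan:

    ;WITH Recurse AS
    (
        -- Anchor member: top-level rows
        SELECT f.FolderID, f.ParentID, 0 AS Lvl
        FROM dbo.Folders AS f
        WHERE f.ParentID IS NULL

        UNION ALL

        -- Recursive member: walk down the hierarchy
        SELECT c.FolderID, c.ParentID, r.Lvl + 1
        FROM dbo.Folders AS c
        INNER JOIN Recurse AS r
            ON r.FolderID = c.ParentID
    )
    SELECT FolderID, ParentID, Lvl
    INTO #Hierarchy                  -- compute the recursion once
    FROM Recurse
    OPTION (MAXRECURSION 0);

    -- Index the temp table to suit the joins made against it later
    CREATE CLUSTERED INDEX cx_Hierarchy ON #Hierarchy (FolderID);

The rest of the query then reads #Hierarchy in both places, so the recursion is evaluated once and the optimizer sees real row counts and statistics for it.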

There are a number of scans and seeks in this plan that produce a large number of rows. I would look at those to see if more selective indexing could be provided. It might also be that the optimizer is choosing a strategy based on a smaller expected number of rows, so you might see the plan shape change once the CTE has been materialized. Nevertheless, you will want to eliminate operators that produce many millions of rows wherever possible.

There appear to be a number of simple aggregations on large numbers of rows in the plan. Consider pre-calculating these using indexed views. If you are not using Enterprise Edition, you will need to replace the aggregations explicitly with view references using the NOEXPAND hint.
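
A rough sketch of the indexed view pattern, using invented names (dbo.Orders, CustomerID, Amount); note that an indexed view needs SCHEMABINDING, a COUNT_BIG(*) column, and a non-nullable expression under SUM:

    CREATE VIEW dbo.OrderTotals
    WITH SCHEMABINDING
    AS
    SELECT
        o.CustomerID,
        SUM(o.Amount) AS TotalAmount,
        COUNT_BIG(*)  AS NumRows
    FROM dbo.Orders AS o
    GROUP BY o.CustomerID;
    GO

    CREATE UNIQUE CLUSTERED INDEX cux_OrderTotals
    ON dbo.OrderTotals (CustomerID);
    GO

    -- Outside Enterprise Edition, reference the view explicitly with NOEXPAND
    -- so the stored aggregate is used instead of recomputing it:
    SELECT ot.CustomerID, ot.TotalAmount
    FROM dbo.OrderTotals AS ot WITH (NOEXPAND);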

The query uses parallelism, but is limited to DOP 4. The query appears to be mostly CPU-bound, so you could think about making more processors available for this task.
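
If more schedulers are available on the box, the limit may be coming from the instance-wide setting or a query hint. Purely as an illustration (dbo.Orders is a placeholder, and a server-wide change should be tested before applying it):

    -- Instance-wide default (requires appropriate server permissions)
    EXEC sys.sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sys.sp_configure 'max degree of parallelism', 8;
    RECONFIGURE;

    -- Or override the limit for this statement only
    SELECT COUNT_BIG(*)
    FROM dbo.Orders
    OPTION (MAXDOP 8);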

The memory grant for the query is excessive. Around 1.2 GB is reserved, though from the anonymized plan I cannot tell how much of that is actually used at runtime on your version of SQL Server. At least some of your tuning effort should be directed at reducing that memory grant. If you are able to use indexed views instead of big hash aggregations, that will help.
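
One way to check granted versus used memory while the query is executing is the memory grants DMV, along these lines:

    SELECT
        mg.session_id,
        mg.requested_memory_kb,
        mg.granted_memory_kb,
        mg.used_memory_kb,
        mg.max_used_memory_kb
    FROM sys.dm_exec_query_memory_grants AS mg
    ORDER BY mg.granted_memory_kb DESC;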

The index seeks on Table8.Index4 and Table6.Index5 are executed 10,724,610 times each. The estimates are good in this area of the plan, so you need to look carefully at the query logic and indexes to see if this can be improved. Both index seeks have residual predicates which you may be able to incorporate into the seek by changing your indexing, adding computed columns, or some other action.
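
As an illustration only (Table8 and Index4 are the anonymized names from the plan; the column names and expression below are invented), two common ways to turn a residual predicate into a seek predicate:

    -- Option 1: if the residual tests a plain column, add it to the index key
    CREATE INDEX ix_Table8_WiderKey
        ON dbo.Table8 (ExistingKeyColumn, ResidualColumn);

    -- Option 2: if the residual tests an expression, index a computed column
    ALTER TABLE dbo.Table8
        ADD SomeDateYear AS YEAR(SomeDate) PERSISTED;

    CREATE INDEX ix_Table8_SomeDateYear
        ON dbo.Table8 (SomeDateYear);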

More generally, the plan is probably too large to expect great results from the optimizer, and the query is likely quite difficult to read and maintain. It is easy to visually break the plan into sections that produce reasonable row counts, so I would definitely break this query up into simpler sections using temporary tables. To be clear, this would go well beyond just the recursive CTE.

alhedo 2013-10-30 13:14:07
Moving the CTE to a temp table did improve the performance quite a bit. There were a couple of OUTER APPLYs at the end of the query; I moved them into the SELECT and that reduced the reads. Is that because of the logical order of execution, with the FROM clause being processed before the SELECT?

Thanks for all the suggestions, much appreciated.

SQLkiwi 2013-10-30 20:20:05
Mostly, yes. Two things. (1) The optimizer doesn't have a huge number of tools available to move outer joins around. (2) In a query with many joins, the final plan shape will often be a lot closer to the written order of the query, simply because there are too many join order combinations for the optimizer to look at.
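
For reference, the kind of rewrite described above might look something like this (all names are invented for illustration; the two forms are equivalent only when the APPLY returns at most one row and one column):

    -- OUTER APPLY in the FROM clause
    SELECT t.ID, x.LastValue
    FROM dbo.SomeTable AS t
    OUTER APPLY
    (
        SELECT TOP (1) d.Value AS LastValue
        FROM dbo.Detail AS d
        WHERE d.ParentID = t.ID
        ORDER BY d.CreatedDate DESC
    ) AS x;

    -- Same logic expressed as a scalar subquery in the SELECT list
    SELECT
        t.ID,
        (
            SELECT TOP (1) d.Value
            FROM dbo.Detail AS d
            WHERE d.ParentID = t.ID
            ORDER BY d.CreatedDate DESC
        ) AS LastValue
    FROM dbo.SomeTable AS t;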
alhedo 2013-11-01 15:54:51
Ok I understand. Thanks again Paul!
John Couch 2013-10-28 21:21:54
Hard to say, but I see numerous Index Scans, a bookmark lookup, and I'm curious why the cardinality estimates are off. This looks like an estimated plan; do you have an actual plan so we can see exactly how the actual rows compare to the estimates?
alhedo 2013-10-28 21:30:32
Thanks, I uploaded the actual plan. It seems like the estimated rows are way off in some cases.
John Couch 2013-10-28 21:39:20
Looks like you have some memory grant bloat occurring: the two largest operations by cost need very little memory (one less than 5 MB and the other 817 KB), yet almost 200 MB is being allocated for both.
Dan Holmes 2013-10-28 21:57:24
This is going to be a hard one as an anonymized plan. Table3.Index12 is scanned 3 times; that would be worth examining for index optimization or query tuning.

There is probably something to be done in the top branch, on the hash join leading to Table6.Index5. 2,920 rows come in and go out, yet that read processes 1.7 million rows to produce the same row count before and after.