Statistics problem?

Alain Martin 2016-09-14 16:02:24


We think the query is running slowly because we are missing statistics, or the statistics are bad. Does that make sense? Is the query too complex?


SQLkiwi 2016-09-15 05:52:00
It is certainly possible that out of date or missing statistics are contributing to the poor performance of this query, but there are a variety of other issues.

You might get lucky and find that updating the existing statistics produces a less disastrous execution plan, but the fundamental problems would remain. Unfortunately, the Index Analysis feature did not capture as much index and statistics information as I would have liked in this case, for whatever reason (permissions, perhaps?).

For example, one simple cardinality misestimation occurs early on at a fairly straightforward operation:

[execution plan fragment: seek with cardinality misestimate]

The seek condition is ModelSecurityRole.UTILTYPE = 133, with an estimate of 1 row against 388 rows actual. You should check that the statistics for this column are representative, and refresh them if not. It is generally worth working right to left through cardinality estimation errors, since an initial misestimate potentially affects all later operations. I very much doubt that this one correction will solve your problem entirely, but it is a start.
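As a sketch of how you might check and refresh those statistics (schema name and the use of `sys.dm_db_stats_properties` are assumptions; verify the actual statistics objects via sys.stats):

```sql
-- List statistics on the table, with freshness and sampling information
-- (dbo schema is assumed here)
SELECT s.name, sp.last_updated, sp.rows, sp.rows_sampled, sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.ModelSecurityRole');

-- Refresh all statistics on the table with a full scan if they look stale
UPDATE STATISTICS dbo.ModelSecurityRole WITH FULLSCAN;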

It's hard to know how much freedom you have to make changes, but it is very likely that one or both views referenced in the query are not written in as optimizer-friendly a fashion as possible. There are a number of plan features that strongly hint at unusual SQL constructions in one or both of the views. For example, sections like this:

[execution plan fragment: Segment, Sequence Project, Join, Stream Aggregate]

The Segment, Sequence Project, Join, and Stream Aggregate are particularly undesirable given the high row counts, and certainly the 32 million index seeks involved there. The Segment-Project-Aggregate pattern (containing an ANY aggregate) is suggestive of an aggregate correlated subquery (x = (SELECT MAX(y) …)), which is easy to write, but rarely performs well in practice. Without being able to see the view definition, it is difficult to be certain about the syntax that causes this pattern to surface, but there can be no doubt that the resulting operations are extremely expensive. This pattern repeats a few times in the plan, so it is definitely worth reviewing.
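For illustration only (the table and column names here are invented, not taken from your plan), the kind of construction that tends to surface as this Segment-Project-Aggregate shape is:

```sql
-- Correlated aggregate subquery: easy to write, but the optimizer often
-- implements it with Segment / Sequence Project / Join / Stream Aggregate,
-- re-evaluating the aggregate per outer row
SELECT d.ItemID, d.Amount
FROM dbo.Detail AS d
WHERE d.Version =
(
    SELECT MAX(d2.Version)
    FROM dbo.Detail AS d2
    WHERE d2.ItemID = d.ItemID
);
```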

The FAST 20 hint on the query might be AX system-generated, but whatever the case, it does tend to lower cardinality estimates below blocking operators, encouraging the optimizer to choose nested loops joins. If you can, try the query without this hint.
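Purely as a hypothetical illustration of where the hint sits (this is not your query), so you know what to look for and remove when testing by hand:

```sql
-- FAST 20 asks the optimizer to return the first 20 rows quickly,
-- often at the expense of total runtime; remove it when testing manually
SELECT o.OrderID, o.OrderDate
FROM dbo.Orders AS o
ORDER BY o.OrderDate DESC
OPTION (FAST 20);
```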

There is nothing wrong with nested loops joins per se, but it is in their nature that inaccurate cardinality estimates can lead to unexpectedly poor performance. Nested loops joins also benefit from clear navigational SQL syntax, and high-quality supporting indexes. Few of the indexes in the execution plan appear to be optimal, with many missing the columns needed to be covering, and some having a column with very low selectivity (PARTITION) as the leading key. This can be important because the statistics histogram associated with the index is built on the leading column only. There appears to be only one value for PARTITION in the database, so this is a bit of a waste of a histogram, even if other useful column statistics exist.
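A sketch of the index reshaping described above, with an assumed index name and an assumed INCLUDE column (substitute whatever columns your query actually touches):

```sql
-- Current shape (illustrative): single-valued PARTITION leads the key,
-- so the histogram carries almost no information
CREATE NONCLUSTERED INDEX IX_ModelSecurityRole_Partition
ON dbo.ModelSecurityRole (PARTITION, UTILTYPE);

-- Better shape: selective column first, with INCLUDE columns so the
-- index covers the query and avoids lookups (RECID is an assumption)
CREATE NONCLUSTERED INDEX IX_ModelSecurityRole_UtilType
ON dbo.ModelSecurityRole (UTILTYPE, PARTITION)
INCLUDE (RECID);
```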

There are also instances where MAX aggregates over the ModelSecurityRole table are used with large row counts as input. One of these has the typical MAX … OVER plan pattern:

[execution plan fragment: MAX … OVER pattern with spools]

The large-ish row counts and multiple spools mean this is another candidate for review. It is sometimes possible to express the same logic using different syntax (typically TOP (1)), which can be made much more efficient with proper indexing. Again, this will be buried somewhere in one of the views.
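To sketch the rewrite (RoleID and Version are invented column names, used only to show the shape), the window-aggregate form and a TOP (1)/APPLY alternative that, with an index keyed on (RoleID, Version DESC), reduces each group to a single seek:

```sql
-- MAX ... OVER form: tends to produce the spool-heavy plan pattern
SELECT r.RoleID, r.Version,
       MAX(r.Version) OVER (PARTITION BY r.RoleID) AS MaxVersion
FROM dbo.ModelSecurityRole AS r;

-- TOP (1) per group via APPLY: one seek per distinct RoleID
-- given a supporting index on (RoleID, Version DESC)
SELECT g.RoleID, m.MaxVersion
FROM (SELECT DISTINCT RoleID FROM dbo.ModelSecurityRole) AS g
CROSS APPLY
(
    SELECT TOP (1) r.Version AS MaxVersion
    FROM dbo.ModelSecurityRole AS r
    WHERE r.RoleID = g.RoleID
    ORDER BY r.Version DESC
) AS m;
```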

The query itself is clearly generated, but there is not too much wrong with it once the excessive parentheses are removed and the misleading cross join is replaced with the inner join actually specified. The optimizer is able to do this, so the problems really are rooted in the design of the tables and the existing indexes (and possibly statistics), but most importantly in the view definitions.

If improving the views is not achievable for whatever reason, you could hand-write a specific query against the base tables instead. This is often a productive way to go, but it can be time-consuming, and may lead to maintenance problems since the query is now decoupled from whatever business logic is in the views.