We think the query is running slowly because statistics are missing, or the statistics are bad. Does that make sense? Is the query too complex?
By Alain Martin 1 asked Sep 14, 2016 at 04:02 PM
It is certainly possible that out of date or missing statistics are contributing to the poor performance of this query, but there are a variety of other issues.
You might get lucky and find that updating the existing statistics produces a less disastrous execution plan, but the fundamental problems would remain. Unfortunately, the Index Analysis feature did not capture as much index and statistics information as I would have liked in this case, for whatever reason (permissions, perhaps?).
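If you do want to try that first, refreshing statistics is cheap to attempt. A minimal sketch (the table name `dbo.YourTable` is a placeholder, since the real object names are not visible here):

```sql
-- Rebuild all statistics on one table with a full scan,
-- rather than the default sampled scan
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;

-- Or, more broadly, update any out-of-date statistics
-- across the whole database
EXEC sys.sp_updatestats;
```

A full scan gives the most accurate histogram but takes longer on large tables; the sampled default is often good enough to test whether statistics are the problem at all.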
For example, one simple cardinality misestimation occurs early on at a fairly straightforward operation:
The seek condition is
It's hard to know how much freedom you have to make changes, but it is very likely that one or both views referenced in the query are not written in as optimizer-friendly a fashion as possible. There are a number of plan features that strongly hint at unusual SQL constructions in one or both of the views. For example, sections like this:
The Segment, Sequence Project, Join, and Stream Aggregate are particularly undesirable given the high row counts, and certainly the 32 million index seeks involved there. The Segment-Project-Aggregate pattern (containing an ANY aggregate) is suggestive of a correlated aggregate subquery (e.g. `x = (SELECT MAX(y) ...)`), which is easy to write but rarely performs well in practice. Without being able to see the view definition, it is difficult to be certain about the syntax that causes this pattern to surface, but there can be no doubt that the resulting operations are extremely expensive. This pattern repeats a few times in the plan, so it is definitely worth reviewing.
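To illustrate the kind of construction meant here (all table and column names below are hypothetical, since the actual view definitions are not visible), a correlated aggregate subquery and one common rewrite:

```sql
-- Hypothetical example of the kind of correlated aggregate subquery
-- that tends to produce the Segment / Sequence Project / Stream Aggregate
-- pattern seen in the plan
SELECT o.OrderID, o.Amount
FROM dbo.Orders AS o
WHERE o.Amount =
(
    SELECT MAX(o2.Amount)
    FROM dbo.Orders AS o2
    WHERE o2.CustomerID = o.CustomerID
);

-- One common rewrite: compute the aggregate once per group,
-- then join back, which often produces a cheaper plan shape
SELECT o.OrderID, o.Amount
FROM dbo.Orders AS o
JOIN
(
    SELECT CustomerID, MAX(Amount) AS MaxAmount
    FROM dbo.Orders
    GROUP BY CustomerID
) AS m
    ON  m.CustomerID = o.CustomerID
    AND m.MaxAmount  = o.Amount;
```

Whether the rewrite actually helps depends on the data distribution and supporting indexes; the point is only that the two forms express the same logic yet can optimize very differently.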
There is nothing wrong with nested loops joins per se, but it is in their nature that inaccurate cardinality estimations can lead to unexpectedly poor performance. Nested loops joins also benefit from clear navigational SQL syntax and high-quality supporting indexes. Few of the indexes in the execution plan appear to be optimal, with many missing the columns needed to be covering, and some having a column with very low selectivity (`PARTITION`) as the leading key. This can be important because the statistics histogram associated with the index is built on the leading column only. There appears to be only one value for
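As a purely hypothetical sketch of the indexing point (the real table, column, and index names are not visible in the plan), a covering index that leads on a selective column rather than a low-selectivity one:

```sql
-- Hypothetical covering index: lead on the selective predicate column,
-- so the single-column statistics histogram is built on it,
-- and INCLUDE the remaining columns the query needs so no lookups occur
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
ON dbo.Orders (CustomerID, OrderDate)
INCLUDE (Amount, Status);
```

The INCLUDE columns make the index covering without bloating the key, and putting the selective column first gives the optimizer a usable histogram for cardinality estimation.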
There are also instances where
The large-ish row counts and multiple spools means this is another candidate for review. It is sometimes possible to express the same logic using different syntax (typically
The query itself is clearly generated, but there's not too much wrong with it, once the excessive parentheses are removed and the misleading cross join is replaced with the inner join actually specified. The optimizer is able to do this itself, so the problems really are rooted in the design of the tables and existing indexes (and possibly statistics), but most importantly in the view definitions.
If improving the views is not achievable for whatever reason, you could hand-write a specific query against the base tables instead. This is often a productive way to go, but it can be time-consuming, and may lead to maintenance problems since the query is now decoupled from whatever business logic is in the views.
By SQLkiwi ♦ 6.6k answered Sep 15, 2016 at 05:52 AM