There's an issue in our system where the cached execution plan for a given sproc causes it to completely stop working for certain permutations of parameters. To get around this I added RECOMPILE to the sproc (the sproc only has one query), which is working fine, and it's no longer locking up. I'm in the process of rewriting it so that it no longer uses 12 different TVFs in one massive SELECT.
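For context, the workaround looks roughly like this. This is a hypothetical sketch; the procedure, table, and column names are placeholders, not the real objects from the system:

```sql
-- Hypothetical illustration: names are placeholders for the real sproc.
CREATE PROCEDURE dbo.GetCompanyReport
    @CompanyId int,
    @FromDate  date
AS
BEGIN
    SELECT O.OrderId, O.OrderDate, O.Total
    FROM dbo.Orders AS O
    WHERE O.CompanyId = @CompanyId
      AND O.OrderDate >= @FromDate
    OPTION (RECOMPILE); -- compile a fresh plan on every execution, so a plan
                        -- cached for one parameter set is never reused for
                        -- another set where it performs badly
END;
```

`OPTION (RECOMPILE)` trades compile cost on every call for a plan that is always built for the actual parameter values.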
All tables in this system follow the same convention for their clustered index key:
CompanyId, Identity, possibly some additional field
All joins between tables always include the CompanyId
Both forms of the join query (with and without CompanyId in the join predicate) yield the exact same results.
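The original queries aren't reproduced here, but the two variants look roughly like this. The table and column names are assumed for illustration only:

```sql
-- Hypothetical tables; names are placeholders for the real schema.

-- Variant 1: join on the identity column only.
SELECT L.OrderId, L.Quantity
FROM dbo.Orders AS O
JOIN dbo.OrderLines AS L
  ON L.OrderId = O.OrderId;

-- Variant 2: join also includes CompanyId. If OrderId is unique, the
-- CompanyId comparison is logically redundant (equal OrderIds already
-- imply equal CompanyIds), but it can change the plan the optimizer picks.
SELECT L.OrderId, L.Quantity
FROM dbo.Orders AS O
JOIN dbo.OrderLines AS L
  ON L.CompanyId = O.CompanyId
 AND L.OrderId   = O.OrderId;
```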
What I've noticed is that whenever a join includes CompanyId it appears to perform better (less I/O, less CPU, faster return, lower subtree cost) than if I just join on the identity columns. But the row estimates are always wrong when I join on (CompanyId, Id) rather than just on the Id. I'm not completely clear on the best way to utilize this information :-) Or whether it really matters that much...
I've added covering nonclustered indexes on all the identity columns that cover the query. One of my thoughts is that it's better to just leave CompanyId off the joins: the cardinality of the data varies a lot between companies, so it's safer to have the queries always run against the covering nonclustered indexes that produce valid estimates. But I'm also a bit confused why including CompanyId always causes the actual rows to be 90 percent greater than the estimated rows...
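A covering index of the kind described might look like this. Again, the names and included columns are assumptions for illustration, not the real definitions:

```sql
-- Hypothetical covering index: keyed on the identity join column, with
-- INCLUDE columns so the query can be answered entirely from this index
-- (no lookup back to the clustered index is needed).
CREATE NONCLUSTERED INDEX IX_OrderLines_OrderId
ON dbo.OrderLines (OrderId)
INCLUDE (CompanyId, Quantity, Price);
```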
I've made sure to update the statistics with FULL SCAN on all involved tables as well just to ensure that is all up to date.
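For reference, that maintenance step is a one-liner per table (hypothetical table name):

```sql
-- Rebuild all statistics on the table by reading every row,
-- rather than the default sampled scan.
UPDATE STATISTICS dbo.OrderLines WITH FULLSCAN;
```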
I'm also curious whether I'm just over-analyzing something that makes a marginal amount of difference :-) At this point I have all the TVFs removed and have the query broken up nicely, but I'm just a bit concerned about the row estimates consistently being off when the joins include CompanyId.
By Shane 33 asked Mar 19, 2015 at 08:52 AM
I would almost never add WITH RECOMPILE at the stored procedure level; OPTION (RECOMPILE) on the specific query inside the procedure is generally the better choice.
Working from the Plan Explorer session file provided, there isn't much to choose between the two options in terms of performance. Both are relatively trivial, low-cost queries with execution times in the millisecond range. The higher number of logical reads associated with the second option is still a relatively small number, and there is no physical I/O involved. Higher logical reads are to be expected with nested loops joins; they are pretty harmless where a small number of rows are involved.
Where the join predicate involves multiple columns, the optimizer's cardinality estimation component is likely to underestimate, particularly if one part of the predicate is effectively redundant. This is a consequence of some of the simplifying assumptions made - the optimizer does not generally expect a predicate to have zero impact on the selectivity. The situation is slightly different in SQL Server 2014 with the new cardinality estimator enabled, but as you are using SQL Server 2008 R2 that doesn't apply here.
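A simplified, assumed numeric illustration of why the redundant column drives the estimate down: the legacy estimator treats each join column's selectivity as independent and multiplies them, so a predicate that in reality filters nothing still shrinks the estimate.

```sql
-- Illustrative arithmetic only (numbers are invented, model is simplified):
--
--   join on OrderId alone          -> estimated rows = 2,000 (accurate)
--   selectivity the optimizer
--   attributes to the extra
--   CompanyId comparison           -> 0.1
--
--   combined estimate ~ 2,000 * 0.1 = 200 rows
--
-- In reality the CompanyId predicate eliminates nothing (it is implied by
-- the OrderId match), so actual rows stay at 2,000: a large under-estimate.
```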
The under-estimation does not cause any significant issues with the sample queries, because there are a wide variety of plans that would all perform reasonably similarly, due primarily to the appropriate indexing and small number of rows involved. To put it plainly, almost any semi-reasonable plan chosen by the optimizer for these queries would perform pretty well.
That said, the second query is the one I would probably prefer:
The accurate estimates would give me confidence that the chosen plan has at least some factual basis. The alternative plan is certainly estimated to be cheaper, but that's simply because the optimizer is erroneously expecting fewer rows.
The purely-nested-loops plan will also start returning rows more quickly than the hash join plan, where the hash table has to be fully populated before the first probe-side row can be returned to the client. You'd be hard-pressed to measure this within a 3-4ms duration of course.
So, none of this really matters in a practical sense right now. But, if the query were to be extended or amended in future, starting from a sensible point (with accurate estimates) is more likely to be helpful than the apparently-better plan based on an unsafe assumption. So my preference would be for the join without the redundant CompanyId predicate.
By SQLkiwi ♦ 6.6k answered Mar 19, 2015 at 09:25 AM
Yeah, actually OPTION (RECOMPILE) on the queries is what I've been doing, but that's definitely not what I said :-/ It's good to know that's preferred. The system has had a number of issues with large reporting queries in sprocs just going off into la-la land... So with the help of alerts and sp_WhoIsActive I've been band-aiding recompiles around, and now I'm in the process of rewriting.
I tested this on Azure V12 (because that's where we're going in a few months) and it produced an identical plan, but 2014 is interesting. It turns the seek on Table4.Index6 into an Index Scan with a really bad estimate (120k estimated vs 2k actual) and then does a merge join. I'm assuming it sees that the nonclustered index is already sorted and then decides sorting the top input for a merge join is better than doing a nested loops join. Just to ensure it was a 2014 thing, I changed the compatibility level to 110 and it produced the old plan; 120 creates the different plan... If I use WITH (FORCESEEK) then it produces the same plan as before...... Interesting :-)
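The two knobs mentioned above look like this in T-SQL. The query and column names are placeholders; Table4/Index6 come from the plan's generic aliases, not the real schema:

```sql
-- Hypothetical: force a seek on the nonclustered index even under the
-- SQL Server 2014 cardinality estimator, overriding the scan choice.
SELECT T4.SomeColumn
FROM dbo.Table4 AS T4 WITH (FORCESEEK)
WHERE T4.OrderId = @OrderId;

-- Or run the whole database under the legacy (pre-2014) estimator by
-- lowering the compatibility level; 120 switches back to the new one.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 110;
```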
This is good to know, because the posted query is just a VERY small subset of the massive reporting query whose estimates are all over the place. So now I'll be building it back up and breaking it apart, and I just wanted to start from a good foundation.
Thank you very much for the very informative reply. Also, this post was especially helpful to me yesterday as I've been rewriting these sprocs: http://sqlblog.com/blogs/paul_white/archive/2012/08/15/temporary-tables-in-stored-procedures.aspx so thanks for that as well :-)
By Shane 33 answered Mar 20, 2015 at 12:58 AM