Valid estimates vs. what appears to be a better-performing query

There's currently an issue in our system where the execution plan for a given sproc causes it to completely stop working for certain permutations of parameters. To get around this I added a recompile to the sproc (the sproc only has one query), which is working fine and it's no longer locking up. I'm in the process of rewriting it so that it no longer uses 12 different TVFs in one massive select.

All the tables in this system follow the same rules for their clustered index keys:

CompanyId, Identity, Possibly some additional field
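
For illustration, the clustering pattern looks roughly like this (the table and column names here are made up, not our real schema):

    -- Hypothetical sketch of the clustering pattern: CompanyId first, then the identity
    CREATE TABLE dbo.Thing
    (
        CompanyId  int          NOT NULL,
        Id         int IDENTITY NOT NULL,
        OtherId    int          NOT NULL,
        SomeValue  varchar(50)  NULL,
        CONSTRAINT PK_Thing PRIMARY KEY CLUSTERED (CompanyId, Id)
    );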

All joins between tables always include the CompanyId

    select t.*
    from other o
    join thing t
        on o.CompanyId = t.CompanyId and o.Id = t.OtherId

The following query would yield exactly the same results:

    select t.*
    from other o
    join thing t
        on o.Id = t.OtherId


What I've noticed is that whenever the join includes CompanyId it appears to perform better (less I/O, less CPU, returns faster, lower subtree cost) than if I just join on the identity columns. But the row estimates are always wrong when I join on CompanyId plus the Ids, versus just joining on the Ids. I'm not completely clear on the best way to use this information :-) Or whether it really matters that much...

I've added covering nonclustered indexes on all the identity columns so that they cover the query. One of my thoughts is that it's better to just leave CompanyId off the joins: the cardinality of the data varies a lot between companies, so it seems safer to have the queries always run against the covering nonclustered indexes that produce valid estimates. But I'm also a bit confused about why including CompanyId always causes the actual rows to be about 90 percent greater than the estimated rows...
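
As a rough sketch (the index and column names are hypothetical), the covering indexes look something like this:

    -- Hypothetical covering nonclustered index keyed on the identity-based join column,
    -- with the other referenced columns included so the query never needs the clustered index
    CREATE NONCLUSTERED INDEX IX_Thing_OtherId
    ON dbo.Thing (OtherId)
    INCLUDE (CompanyId, SomeValue);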

I've also updated statistics WITH FULLSCAN on all the involved tables, just to make sure those are up to date.
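
That was along these lines (table names are placeholders for the real ones):

    -- Rebuild statistics from a full scan of each table rather than a sample
    UPDATE STATISTICS dbo.Thing WITH FULLSCAN;
    UPDATE STATISTICS dbo.Other WITH FULLSCAN;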

I'm also curious whether I'm just over-analyzing something that makes a marginal amount of difference :-) At this point I have all the TVFs removed and have the query broken up nicely, but I'm still a bit concerned about the row estimates consistently being off when the joins include CompanyId.

Plan.pesession (206.5 kB)
By Shane, asked Mar 19, 2015 at 08:52 AM

2 answers

Currently to get around this I added recompile to the sproc

I would almost never add WITH RECOMPILE to a stored procedure. I would instead add an OPTION (RECOMPILE) query hint to the statement(s) inside the procedure that require it.
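
To illustrate the difference, here's a minimal sketch (the procedure and object names are invented for the example):

    -- Procedure-level recompile: the whole procedure is compiled fresh on every execution
    CREATE PROCEDURE dbo.GetThings_ProcRecompile
        @CompanyId int
    WITH RECOMPILE
    AS
        SELECT t.Id, t.SomeValue
        FROM dbo.Thing AS t
        WHERE t.CompanyId = @CompanyId;
    GO

    -- Statement-level recompile: only the marked statement gets a fresh plan,
    -- compiled with the actual parameter values visible to the optimizer
    CREATE PROCEDURE dbo.GetThings_StatementRecompile
        @CompanyId int
    AS
        SELECT t.Id, t.SomeValue
        FROM dbo.Thing AS t
        WHERE t.CompanyId = @CompanyId
        OPTION (RECOMPILE);
    GO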

What I've noticed is that whenever the join includes CompanyId it appears to perform better (less I/O, less CPU, returns faster, lower subtree cost) than if I just join on the identity columns. But the row estimates are always wrong when I join on CompanyId plus the Ids, versus just joining on the Ids. I'm not completely clear on the best way to use this information :-) Or whether it really matters that much...

Working from the Plan Explorer session file provided, there isn't much to choose between the two options in terms of performance. Both are relatively trivial, low-cost queries with execution times in the millisecond range. The higher number of logical reads associated with the second option is still a relatively small number, and there is no physical I/O involved. Higher logical reads are to be expected with nested loops joins; they are pretty harmless where a small number of rows are involved.
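
If you want to reproduce that comparison yourself, the simplest way to see the logical reads and CPU figures per statement is to enable the session statistics before running each variant (a sketch, using the table names from the question):

    -- Report logical/physical reads and CPU/elapsed time for each statement
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    SELECT t.*
    FROM other AS o
    JOIN thing AS t
        ON o.CompanyId = t.CompanyId
        AND o.Id = t.OtherId;

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;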

Where the join predicate involves multiple columns, the optimizer's cardinality estimation component is likely to underestimate, particularly if one part of the predicate is effectively redundant. This is a consequence of some of the simplifying assumptions made - the optimizer does not generally expect a predicate to have zero impact on the selectivity. The situation is slightly different in SQL Server 2014 with the new cardinality estimator enabled, but as you are using SQL Server 2008 R2 that doesn't apply here.

The under-estimation does not cause any significant issues with the sample queries, because there are a wide variety of plans that would all perform reasonably similarly, due primarily to the appropriate indexing and small number of rows involved. To put it plainly, almost any semi-reasonable plan chosen by the optimizer for these queries would perform pretty well.

That said, the second query is the one I would probably prefer:

[Plan image: nested loops plan]

The accurate estimates would give me confidence that the chosen plan has at least some factual basis. The alternative plan is certainly estimated to be cheaper, but that's simply because the optimizer is erroneously expecting fewer rows.

The purely-nested-loops plan will also start returning rows more quickly than the hash join plan, where the hash table has to be fully populated before the first probe-side row can be returned to the client. You'd be hard-pressed to measure this within a 3-4ms duration of course.
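
If you ever want to compare the two join strategies head-to-head on the same query, join hints will force each physical join type. This is purely a diagnostic sketch, not something I would leave in production code:

    -- Force a nested loops plan: rows can stream out as soon as each match is found
    SELECT t.*
    FROM other AS o
    JOIN thing AS t
        ON o.Id = t.OtherId
    OPTION (LOOP JOIN);

    -- Force a hash join plan: the build input must be fully consumed before any rows return
    SELECT t.*
    FROM other AS o
    JOIN thing AS t
        ON o.Id = t.OtherId
    OPTION (HASH JOIN);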

So, none of this really matters in a practical sense right now. But, if the query were to be extended or amended in future, starting from a sensible point (with accurate estimates) is more likely to be helpful than the apparently-better plan based on an unsafe assumption. So my preference would be for the join without the redundant CompanyId predicate, but you shouldn't be overly concerned either way.

By SQLkiwi, answered Mar 19, 2015 at 09:25 AM

Yeah, actually OPTION (RECOMPILE) on the queries is what I've been doing, but that's definitely not what I said :-/ It's good to know that's the preferred approach. The system has had a number of issues with large reporting queries in sprocs just going off into la-la land... So with the help of alerts and whoisactive I've been band-aiding recompiles around, and now I'm in the process of rewriting.

I tested this on Azure V12 (because that's where we're going in a few months) and it produced an identical plan, but 2014 is interesting. It turns the seek on Table4.Index6 into an index scan with a really bad estimate (120k estimated vs 2k actual) and then does a merge join. I'm assuming it sees that the nonclustered index is already sorted and decides that sorting the top input for a merge join is better than doing a nested loops join. Just to confirm it was a 2014 thing, I changed the compatibility level to 110 and it produced the old plan, while 120 creates the different plan... If I add WITH (FORCESEEK) it produces the same plan as before... Interesting :-)
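
The test looked roughly like this (the database name is a placeholder, and the table names are stand-ins for the real ones):

    -- Flip between the legacy (110) and new (120) cardinality estimator
    ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 110;  -- old CE: original nested loops plan
    ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 120;  -- new CE: index scan + merge join plan

    -- Under 120, forcing a seek on the nonclustered index brings back the original plan shape
    SELECT t.*
    FROM other AS o
    JOIN thing AS t WITH (FORCESEEK)
        ON o.Id = t.OtherId;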

But, if the query were to be extended or amended in future

This is good to know, because the posted query is just a VERY small subset of a massive reporting query whose estimates are all over the place. I'll now be building it back up piece by piece and just wanted to start from a good foundation.

Thank you very much for the very informative reply. Also, this post was especially helpful to me yesterday as I've been rewriting these sprocs: http://sqlblog.com/blogs/paul_white/archive/2012/08/15/temporary-tables-in-stored-procedures.aspx so thanks for that as well :-)

By Shane, answered Mar 20, 2015 at 12:58 AM