Interesting performance problem

Gaurav 2014-05-01 15:11:34

Hello Team – I'm having an interesting performance problem. When I run the attached SQL Server query, it takes hours to complete. There is no CPU/IO/network spike and no blocking during that time. Attached is the execution plan I captured on 30th April; the most expensive operator by cost is the bookmark (key) lookup (68%). Since there are many columns in the SELECT statement, I can't create a covering index due to index size and other problems, so the bookmark key lookup persists. Is there any other way to optimize this query?

Interestingly, when I captured the plan today (1st May) there is no bookmark key lookup operator, and no index modification was done. Why is it not using that operator now, since the query is the same? I'm attaching today's plan and index details as well.

FYI: index statistics are updated on a daily basis.

I would appreciate your response on this. Thanks.

SQLkiwi 2014-05-01 16:34:01
>Why is it not using that operator now, since the query is the same?

The query you write is just a logical specification of the results you require. SQL Server can choose from a variety of physical execution strategies to produce the results. Which physical plan it chooses depends heavily on statistical information. As the statistical information changes, so might the physical execution plan choice.

With a large and complex query involving a large number of joins, there are more potential strategies than SQL Server has time to consider, so heuristic rules are used to narrow down the search space. In addition, the chances that SQL Server's estimates are accurate quickly diminish as the size of the query increases.
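If you want to see the statistical information the optimizer is working from, you can inspect it directly. As a sketch (the table name `dbo.ISRM_Ticket` below is only a guess inferred from your index name; substitute your own objects):

```sql
-- Show the header, density vector, and histogram for one statistics object.
DBCC SHOW_STATISTICS ('dbo.ISRM_Ticket', 'IX_ISRM_Ticket_TicketFolderID');

-- When each statistics object on the table was last updated.
SELECT name,
       STATS_DATE(object_id, stats_id) AS last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID('dbo.ISRM_Ticket');
```

Comparing the histogram and last-updated dates between the two compilation dates can show you what statistical change prompted the new plan shape.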

>Is there any other way to optimize this query?

Of course. It is clearly not reasonable for a query that accesses such small tables and returns so few rows to run for several hours.

The 68% cost percentage you see on the key lookup is just an optimizer estimate, even in a post-execution plan with 'actual' row counts. You should not base your tuning efforts on these numbers alone, because estimates frequently do not match reality. One reason is that estimated row counts might not match actual row counts, meaning the whole basis for the optimizer's execution plan choice was unsound in the first place. This is much more likely in complex query plans, where repeated estimations compound errors.

The query plans you posted have a number of other potential issues that look much more serious than the key lookup. There are far too many of these to detail, but most relate to a poor estimate of the number of rows that will be encountered. Some operations, like sorts and hashing, are very sensitive to estimation errors. In any case, inaccurate estimations mean SQL Server will likely choose the wrong physical execution strategy.


Fundamentally, the query just tries to do too much in one step. Quite aside from the problems SQL Server faces in choosing a reasonable execution plan, it looks like a bit of a nightmare for a human to maintain.

You should rewrite the query in a smaller number of steps, using temporary tables to store intermediate results. This will help people understand and maintain the query, make it easier to identify suboptimal indexing or coding practices, and most of all it will help SQL Server produce an efficient execution plan based on accurate estimations.

The query is far too large to attempt such a rewrite here, but this is my advice. Break the query into smaller steps with queries you can tune individually. The intermediate result set sizes should be small, and you might consider adding helpful indexes to the temporary tables as you go. This process might even allow you to remove that final SELECT DISTINCT you are doing. This is normally a sign that the query introduces duplicates unintentionally. Breaking the query down will help you identify where this first occurs, and take steps to eliminate it.
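As an illustration of the pattern only (every table, column, and variable name below is invented, since the real query is not reproduced here):

```sql
-- Hypothetical example of breaking one giant query into steps.
DECLARE @StartDate datetime = '20140401';

-- Step 1: materialize a small, filtered intermediate result.
SELECT t.TicketID, t.TicketFolderID, t.LastUpdateDateTimeStamp
INTO #RecentTickets
FROM dbo.Ticket AS t
WHERE t.LastUpdateDateTimeStamp >= @StartDate;

-- Index the temporary table if later steps join on this column.
CREATE CLUSTERED INDEX cx ON #RecentTickets (TicketID);

-- Step 2: join the small intermediate set to the next table.
-- The temporary table has accurate statistics and cardinality,
-- so the optimizer's estimates for this step are reliable.
SELECT r.TicketID, d.Detail
INTO #TicketDetails
FROM #RecentTickets AS r
JOIN dbo.TicketDetail AS d
    ON d.TicketID = r.TicketID;

-- Final step: a much simpler query over small, well-understood inputs.
SELECT TicketID, Detail
FROM #TicketDetails;
```

Each step is small enough to tune (and understand) on its own, and if duplicates appear, you can see exactly which step introduced them.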

And remove all those NOLOCK hints. If you really want to run at READ UNCOMMITTED isolation, use a SET TRANSACTION ISOLATION LEVEL statement.
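If you genuinely need dirty reads, set the isolation level once for the session rather than scattering hints over every table reference (the table in this sketch is hypothetical):

```sql
-- Instead of WITH (NOLOCK) on every table reference:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT TicketID
FROM dbo.Ticket;  -- hypothetical table, no hint needed

-- Restore the default when done:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```

This makes the intent explicit in one place and is much easier to change later than editing dozens of hints.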

Finally, note that both query plans you posted request a memory grant of between 2.5 and 10GB. This is again a consequence of inaccurate estimations. There are a few big operations in the query that require significant memory; breaking the query up will help you tune this as well.
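You can watch the memory grants of currently executing queries with a DMV query along these lines (available on SQL Server 2008 R2):

```sql
-- Requested vs. granted vs. used workspace memory, in KB,
-- for queries currently holding or waiting on a memory grant.
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;
```

A large gap between requested and used memory is another sign that the grant was sized from bad estimates.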

Edward Norris 2014-05-01 16:51:34
You bring up fantastic points. I always first attempt to give a quick answer to solve the problem at hand, instead of stepping back and restructuring the entire item. I believe mine comes from the pressure to "improve with minimal effort" rather than to "improve by potentially taking the extra time needed to rewrite it and make it correct". But that is up to how much time Gaurav is given (or has left) on the project.
SQLkiwi 2014-05-01 17:05:48
Edward, I understand, and that's fine. I tried to focus on a solution approach that would solve multiple fundamental problems rather than trying to find a quick patch.
Gaurav 2014-05-02 05:28:48
I appreciate your detailed response. I understand that the engine chose a different plan the second time (I also cleared the plan cache with DBCC FREEPROCCACHE) and that it did not use the bookmark key lookup operator. But I'm wondering why it is not using that operator, because fundamentally, when you have multiple columns in the SELECT statement that are not part of your nonclustered index, the engine should use that operator: it follows the clustered key to access those columns' data, and that's how it returns the result.

Please let me know if I'm understanding it correctly. Thanks again!

SQLkiwi 2014-05-02 13:42:15
The choice is a cost-based decision made by the optimizer. For the second compilation, a clustered index scan looked cheaper – probably because of a change in statistical information. I've already explained this in the main answer text.
Gaurav 2014-05-04 04:46:05
SQLKiwi – I did more analysis and checked the XML behind both graphical execution plans, and found that the memory grant is different for these two plans. The 30th April plan requires less memory than the 1st May plan, so perhaps on 30th April there was not enough memory on the server to pick a good plan, hence the key lookup, but on 1st May memory was sufficient to get a good plan.
Gaurav 2014-05-09 11:27:54
Hi SQLKiwi – One more thing: I'm updating the stats daily, in such a way that if even one row in a table is modified, its stats are updated. From 15th April to 30th April this query was executing slowly. If I was updating the stats daily, why was the plan not being invalidated in the plan cache? I'm on SQL Server 2008 R2 + SP2 + CU5. But on 1st May I cleared the plan cache, and only since then has the query been executing fast. Any more suggestions would be appreciated.
SQLkiwi 2014-05-04 05:02:58
Yes the memory grant is different, but that's a consequence of the optimizer choosing a different plan shape. Memory grant is determined based on the plan shape chosen, not the other way around.
Gaurav 2014-05-14 19:35:05
Thanks again, SQLKiwi! But the interesting part here is that we had this problem for the whole of April, and we update the stats daily (with FULLSCAN), so why did SQL Server not create a good plan during that entire month? Since I cleared the cache on 1st May, it has created a good plan, and it has been executing without any problem ever since. I'm on SQL Server 2008 R2, so this should not be a case of parameter sniffing; it should invalidate all past plans after updating the stats. Any clue?
SQLkiwi 2014-05-14 21:24:15
Clearly something changed. Without a time machine and physical access to your system, it's impossible to say. I do know that SQL Server rarely does things at random 🙂
Gaurav 2014-05-20 07:32:20
Hey Kiwi – Thanks for the response again! If the same issue comes up next time, what data would you recommend we save so that we can do an analysis?
Edward Norris 2014-05-01 15:25:54
Have you tried adding LastUpdateDateTimeStamp as an included column on the IX_ISRM_Ticket_TicketFolderID index?
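Something along these lines; the key column list is only a guess inferred from the index name, so you would need to reproduce the index's real key columns:

```sql
-- Rebuild the existing index with the extra column in the INCLUDE list.
-- TicketFolderID as the key column is assumed from the index name.
CREATE NONCLUSTERED INDEX IX_ISRM_Ticket_TicketFolderID
ON dbo.ISRM_Ticket (TicketFolderID)
INCLUDE (LastUpdateDateTimeStamp)
WITH (DROP_EXISTING = ON);
```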
Gaurav 2014-05-02 05:27:57
Edward – will we get any benefit from adding that column to the index?