Different Query Plans from SQL 2008R2 to SQL 2012

Paul Bell 2014-04-14 22:58:34

I have a pretty bad query generated by MS Dynamics CRM 2011 RU16 (see attached).
There is not a way to edit the query, add hints, or any optimization on the App side.

We recently upgraded to SQL 2012 and are having some severe performance problems with this query.
See attached Actual Query Plan from SQL 2008R2.
See attached Actual Query Plan from SQL 2012.

SQL 2012 won't make it parallel unless I include Trace 8469 on this query. Even with that flag, the query returns in 7 seconds (vs 30+) where on the 5 year old server (same # cores), it returns in <2 seconds. I've been trying to create a new plan in SQL 2012 to use the query trace 8649 for this query (without startup trace 8649) but have been unable to make it stick. Am I barking up the wrong tree? What other tactic can i use to make it go parallel? I've also included sp-Config output from both servers (attached as well).

Moved from comments:

I've updated statistics and ran: exec sp_recompile N'cda_employeeExtensionBase'; exec sp_recompile N'cda_employeeBase'; …no change in query performance. I enabled trace 4199 globally, recompiled … no change in query performance. 8649 seems to be the only one that does the trick to get the parallelism juice flowing! I set the MAXDOP and Parallelism cost back to the same, all traces off, recompile, no change.

SSRS is running on this DB server so I reduced MAX memory, and left min memory high to reduce the amount of data SQL purges from memory. I will set them back now to the R2 values and see if that helps. Reading up on parameter sniffing at the moment.

I've applied CU8 with limited improvements.
Here are the SQL stats as requested.

SQLkiwi 2014-04-15 03:13:38
I moved your comments to the question to assist future readers (all relevant information in one place).
SQLkiwi 2014-04-15 00:38:14
The root of the problem is that SQL Server 2012 finds a cheaper-looking serial plan than SQL Server 2008 R2 was able to. You can confirm this by looking at the estimated versus actual rows at a key point in the 2012 plan:

Plan fragment

The optimizer expected to have to process only 749 rows here to meet the ultimate goal of a maximum of TOP (51) matches. The query in fact only produces 31 rows in total, so the operators further down the tree end up processing their entire potential input set, rather than the small number of rows expected by the optimizer. There is a supported way to disable this Row Goal behaviour, using TF 4138, but applying this globally would likely regress other important plans. I am holding this option in reserve at this stage.

Most likely, the new plan is due to a new optimizer trick added in 2012. There is no supported way to turn this feature off, and although I know how to do it, I am reluctant to share it for supportability reasons. I would need a statistics-only copy of the three tables involved to validate my reasoning on this.

The standard mechanism to prevent a plan regression when upgrading to a new version is to use a plan guide. You have the old plan from SQL Server 2008R2, so a plan guide would be a perfect solution. However, the query uses LIKE which cannot be parameterized, meaning we cannot create a single plan guide for all future queries of the same pattern. This is most unfortunate. I doubt you will want to create a plan guide for every possible name combination!

You should be careful to set the system configuration options on 2012 exactly as they are on the R2 instance. There are differences in:

  • The cost threshold for parallelism (5 vs 4)
  • Max degree of parallelism (4 vs 8)
  • Max server memory (120,000 vs 112,640)
  • Min server memory (65,536 vs 102,400)

All these can affect plan selection.

Note: TF 8649 is undocumented, you cannot use it on a production system without invalidating your support agreement.

There are no parameters to sniff in this query (and there can't be, hence the plan guide problem).

Paul White

Joe O'Brien 2014-04-15 13:43:28
Notice how the 2008R2 plan received a full optimization, while the 2012 plan had a timeout….(reason for early termination of statement optimization=timeout)
SQLkiwi 2014-04-15 17:56:02
Right. 2012 finds a much cheaper-looking serial plan earlier than 2008 R2, and the timeout on exploration is set lower as a result. This is a consequence rather than a cause.
Paul Bell 2014-04-15 18:15:55
I didn't see the timeout last night. We ran with TF8780 and did DBCC FREEPROCCACHE. The queries got parallel plans for about 30 minutes, then the bad plans struck again!

After CU8, 4199 behaved differently (in my view) and went into production about 4am today. Should I have any reservations about that?

Continuing to monitor…

SQLkiwi 2014-04-15 18:34:11
If you're running 8780 in production you are absolutely insane. Throwing dangerous undocumented trace flags at the problem is not a sensible course of action at all. If you want to work towards a proper solution, provide the stats copy of the tables I asked for and I will try to help you.

While I understand you have a real issue to fix here, I want no part of this sort of dangerous mucking about on a production system. If your performance problem is so serious you are resorting to this sort of thing, you should have rolled back to 2008 R2 while working with your vendor and/or Microsoft CSS to find a supportable 2012 solution.

SQLkiwi 2014-04-15 18:45:45
On the other hand, if you're saying that applying CU8 to 2012 SP1 and enabling 4199 has resolved your issue, that's fine 🙂 Assuming you removed 8780.
Paul Bell 2014-04-15 20:01:47
Ha! Yes, sorry for my lack of clarity. 8780 + 4199 in DEV – 8780 removed, things worked great with just 4199, ONLY 4199 implemented in production.
SQLkiwi 2014-04-15 20:09:39
I'm very glad (and relieved) to hear that.
@SQLTrooper 2014-04-15 14:07:26
Being a SQL Server DBA and a Dynamics AX Administrator, I'm first going to ask why you are not going with the recommended configuration of Max Degree of Parallism = 1 – at which time cost threshold for parallelism becomes irrelevant. That is the same recommendation from Microsoft for CRM. I also know that the Dynamics products do a ton of CRUD activity so fragmentation does become an issue quickly – what does your maintenance approach look like? I personally rebuild/reorg indexes and update stats nightly using Ola's scripts. If you don't, any cached plan can become problematic. To do so weekly would burden my resources and bloat my trans log – thus my approach.

BTW, it's hard to not notice the index naming convention – they were generated using the DTA. You realize that Dynamics will not preserve those indexes? If CRM is anything like AX, you have to do all your index tuning inside the product via the AOT, else you will lose those indexes. I think you need to take each query individually and tune them specifically – not sure if your version allows you to use included columns (mine does not).