Looks like a "Optimizer injected sort" issue

mike011420 2014-08-17 03:07:46

I have this plan. It looks to me like this query could use an index to optimize the sort "to support the merge operation". That's great but what would that look like?

Here is a representative query:

SELECT
d.col1
FROM #Temp AS d
INNER JOIN configtable AS c
ON parameterName = 'Name'
INNER JOIN
AuditTable AS at
ON at.databaseID = c.ParameterValue
AND at.col1 = d.col1
LEFT OUTER JOIN
multiValueTable as mvt
on mvt.databaseID = c.parameterValue
AND mvt.col1 = d.col1
INNER JOIN
StatusTable AS st
on st.databaseID = d.databaseID
AND st.col1 = COALESCE(at.col2, mvt.col2)
AND st.col2 = 'XXX'
WHERE  COALESCE(at.col3, mvt.col3) IS NOT NULL

This query can take up to 12 hours to run. Any hints?

Thanks

Mart 2014-08-17 10:14:53
Hi there

It would be easier to answer more definitively if the actual plan un-anomilised was provided. As we've not got that I'll tell you what I'd begin looking for, this of course can change as you go on but it'll give you a start.

The merge join (right outer join) is getting data from the top clustered index scan on table 3 and from the path below. The sort operation at 41.3% is using two columns, one from table1 and one from table 5. These are already sorted by operations later down the path.

In order to affect the sort in question you'd need to get the data in the correct order by another method, which as you mention, can be by having it in the correct order already by using indexes. As the lower section of this is a little more complex that the upper I'd suggest beginning by seeing if you can add an index to Table3, this would remove the clustered index scan and replace it with an index operation, assuming you can alter the order this data is provided, to match that of the path below, you may be good to stop there. However, table 1 is having a table scan done so that would probably benefit from an index, sorting the data in the order to match that required for the joins.

Finally, looking at the query, I expect the WHERE clause may be better written something like this (not tested) – I think I may start with this to see what affect it has on the estimated plan.

WHERE at.col3 is not null
OR
(at.col3 is null and mv.col3 is not null)

Remember altering the order of the data within indexes can have an affect on other queries so should be carefully tested.

All this is just theoretical and I may well be wrong about parts of it so please use it as a start and possibly to rule out methods of optimizing this one rather than the exact process that must be followed.

Good luck

Mart

mike011420 2014-08-17 15:55:30
Thanks I appreciate the input. Is it possible for you to give me an nudge about "seeing if you can add an index to Table3". This has been frustrating me to no end. I cannot seem to find a source that will explain how to do this either at a level I can understand (Index Design Technet) or by example given in other blog posts which are not complex enough. I have done the "shotgun" approach and cannot seem to find the right combination that will give me the seek I am looking for.

Thanks

Mart 2014-08-18 13:00:42
Hi Mike

If you look at the plan you have you'll see the actual column names (in the plan posted these are changed so I can't say what they are really) that are being returned by the current clustered index scan. These four columns are needed in the new non-clustered index. What order and level I'll come back to.

The Sort operation below is sorting on table 5 column 5 and table 1 column 6, these are the ones I expect you should consider (as they're being called by the merge operator) when sorting the data in the new index.

Looking at the Join Diagram:
Table1.col6 = table2.col8
table5.col5 = Table2.col7

Table3.col7 = table5.col5

The index for Table3 should therefore (fingers crossed) be on Table3.Col7 and include Col2, Col3 and Col8 – This, I think, will cover that part.

Looking further into this I'm not sure how positive this will be but definitely worth a shot to see how it changes things.

Have you tried the where clause change, currently I'm guessing it's not sargeable ie indexes can be used correctly as the optimize can't determine what the value is.

Let me know how you get on or if you get a new plane:)

mike011420 2014-08-18 20:20:18
Well it was no go in ridding of the scan and replacing with a seek. I had luck getting it to use the non-clustered index but I was not sure if I would gain anything. Did not seem like it though I did not actually run the query.

I am frustrated enough now that I am willing to get your the non-anonymized execution plan if you want it or think it will help.I don't see a way to attach a file in these comments.

Mart 2014-08-18 21:50:17
You can attach a file in the answers section, not sure if you can add one here. The real query and plan would be good I think.
Mart 2014-08-19 06:45:32
Cool got that.