I'm struggling with this query; it is particularly slow due to my use of RANK() OVER, which I need later in order to pick the highest row (within each partition) from two subsets of data.
Looking at the query plan, there are massive cardinality issues indicated by the large differences between estimated rows and actual rows.
My gut feeling is that there is nothing wrong with the query itself, but perhaps the way the statistics and/or indexes are set up on the tables. The statistics are updated daily so I know it's not a case of outdated stats.
Would appreciate any help anyone can give on this.
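For reference, the per-partition top-row pattern I'm using looks roughly like this (table and column names are simplified placeholders, not the real schema):

```sql
-- Hypothetical, simplified version of the pattern; real names differ.
WITH Ranked AS
(
    SELECT  t.CustomerId,
            t.TradeDate,
            t.Amount,
            RANK() OVER (PARTITION BY t.CustomerId
                         ORDER BY t.TradeDate DESC) AS rnk
    FROM    dbo.Trades AS t
)
SELECT  CustomerId, TradeDate, Amount
FROM    Ranked
WHERE   rnk = 1;   -- highest row within each partition
```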
Edit: Based on the help and advice given so far, I've attached non-anonymised execution plans for the query before and after the change that speeds it up. Here's the one with the function call as part of an INNER JOIN. Here's the one with the function call in the WHERE clause instead. The latter runs much more quickly and yields the same query results. It appears that removing the RID lookup, which Sujai Karnam and SQLkiwi identified as problematic, improves the performance massively.
By Neo 31 asked Feb 12, 2014 at 01:09 PM
Always hard to give any decent advice with an anonymized plan...
But, remove the table-valued function. A) Dump its results into a temp table and use that in the query if you are using 100% of its output; B) find another way to do its work if not.
The problem is that a TVF always estimates at one row; here it is returning 564 rows.
Table2 is also giving a relatively poor estimate, the error of which is being magnified up the tree.
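The temp-table approach from A) can be sketched like this (the function name, parameter, and column are hypothetical stand-ins for whatever the real TVF returns):

```sql
-- Sketch only: function and column names are hypothetical.
-- Materialise the TVF output first, so the optimiser works with
-- real row counts and statistics instead of the fixed 1-row guess:
SELECT  f.Id
INTO    #FunctionOutput
FROM    dbo.SomeTableValuedFunction(@Param) AS f;

-- Optionally index the temp table if the main query joins on Id:
CREATE CLUSTERED INDEX cx ON #FunctionOutput (Id);

-- Then join to #FunctionOutput in the main query instead of the TVF.
```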
By Dave Ballantyne 263 answered Feb 12, 2014 at 01:56 PM
Although the RID Lookup operator has a cost of 0.1%, the number of executions (1,185,988 lookups) is very high. The lookup fetches Column18 of Table14. If Table14.Index9 is altered to include Column18, there should be a good improvement.
By Sujai Karnam 48 answered Feb 12, 2014 at 04:52 PM
Thanks for all your help so far. I've looked at the suggestions and eliminated the columns and tables mentioned from the query entirely, and it still runs very slowly. As an exercise, I have whittled the query down to the bare minimum that still runs very slowly, leaving only the two JOINs that cause it to take so long. Without either of these (even with everything else included), it runs quickly.
The non-anonymised query analysis (see here) shows a horrendous number of rows being output, but it takes a long time in SSMS before it even starts returning rows to the 'Results' window. What I found is that the function call simply returns a single-column list, and the purpose of the JOIN is to filter the results on that list. It occurred to me that this should actually go in the WHERE clause.
So, the resultant query is
This is much quicker, although not perfect - it still takes several minutes, but that's better than over an hour.
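The rewrite I'm describing follows this pattern (table and function names here are made up for illustration; the real query differs):

```sql
-- Illustrative only; names are hypothetical.
-- Before: the single-column TVF result was joined in:
-- SELECT  t.*
-- FROM    dbo.BigTable AS t
-- INNER JOIN dbo.GetFilterList(@Param) AS f
--         ON f.Id = t.Id;

-- After: use the list purely as a filter in the WHERE clause:
SELECT  t.*
FROM    dbo.BigTable AS t
WHERE   t.Id IN (SELECT f.Id
                 FROM   dbo.GetFilterList(@Param) AS f);
```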
By Neo 31 answered Feb 13, 2014 at 06:12 PM
2. The statistics on the table "F_V_IrZeroDelta" are either outdated or based on a small sample.
By Sujai Karnam 48 answered Feb 18, 2014 at 03:31 AM