This simple query is currently using the wrong index when I run it on my reporting server. Instead of TransactionsTMernbrIDTDate, it is using TransTDateTmernbr
The result is a query that takes almost a minute instead of a fraction of a second to complete.
Index Stats are completely up to date.
When running same query on Dev server which has a backup a few weeks old, then the right index is picked by sql server
By mauripop 1 asked Jan 04 at 12:11 AM
Hmmm, I just realized that a lot of information I asked you for is already included in the file you posted. (I often forget how good plan explorer actually is :D)
So, here is what I see and why the optimizer picked the plan you get. After that, I'll show how you can probably fix this.
The statistics for column t_mernbr are included as part of the statistics on index TransactionsTMernbrIDTDate, because t_mernbr is the leading column of that index. For the value in your query (7000043445), no specific entry in the statistics histogram exists. The value falls between two steps, for 7000038795 and 7000053075. The values SQL Server will use are the range values of the latter step. These show that at the last statistics update, there were 516,874 rows between these two values, for 39 distinct values. SQL Server assumes a uniform distribution within each interval, so each of those 39 values is assumed to have exactly 13,205 rows. SQL Server also always assumes that you would only query for data that exists, so it assumes that the value 7000043445 exists in the table - and as mentioned, that there are 13,205 rows with this value.
Based on that expectation, it has to choose between two strategies. The one you see on your test system is to seek index TransactionsTMernbrIDTDate to quickly find the first row with t_mernbr equal to 7000043445, then read all 13,205 of them, sort them by date, and then present you the one row with the lowest date value. This sounds cheap based on the seek, but it still has to read 13,205 rows, AND it has to sort them (which costs a lot of cpu and memory). The alternative option is to use an ordered scan of index TransTDateTmernbr. This index is sorted by date, so in this case it can simply start at the beginning until it hits the first row that has the correct t_mernbr value, return it, and stop immediately. No need to read the rest or to sort anything. But at the price of having to read rows for "random" t_mernbr values until it finds the first match.
At this point the optimizer does some simple math. The expected number of matching rows is 13,205. The total number of rows in the table is 96,957,900. That means that 0.136% is a match. Again assuming that all data has a uniform distribution, this means that the expectation is to read 7,342.5 rows before it finds the first match.
At this point the comparison between the two plans becomes easy. The first plan has to read 13,205 rows from an index that includes a thousand columns (and is therefor quite wide - so 13,205 rows will be a lot of pages) and sort them, before the query can return data. The second plan has to read only 7,342.5 rows from an index that is a lot narrower (one indexed column, one included column, and the clustered index reference that every index needs), and does not need a sort. Easy choice!
Now when you run it, the choice backfires. The root case is a bad assumption by SQL Server. It assumed that the value you are looking for exists, but in reality it doesn't. The plan it chose, to start at the earliest date and stop as soon as it finds a match, ends up reading all 96,957,900 rows before being able to return nothing. The plan you have on Dev uses the index on t_mernbr to find all matching rows, finds nothing, returns nothing to the Sort, that then also finishes extremely quickly and returns nothing.
So now you understand why the "bad" index was chosen. You may also wonder why the same does not happen on Dev. The answer is: even if both the data and the indexes are the same (which you should definitely verify because it's not guaranteed), the statistics most likely are not. And if the statistics feed different data to the optimizer, then the decision process remains the same but the numbers will be different. And in this case the result was that the other plan was estimated to be cheaper.
Of course, what you really want to know is how to fix this. In many cases this is a hard problem, because we have no control over which values will be included in the statistics histogram and which values will be put in interpolation ranges. We also cannot tell SQL Server to stop assuming that the value we are looking for exists. But in your specific case, there luckily is (probably) an easy fix.
Index TransactionsTMernbrIDTDate is currently defined as an index on t_mernbr, with included column t_date. You can change this to be an index on t_mernbr, t_date (so both columns are indexed, with t_mernbr as the leading column). For this specific query, the optimizer now knows that all rows with the same t_mernbr are arranged within the index by date, so it can use the index to find the first matching row and immediately return it. That strategy beats any other strategy so regardless of statistics and estimates should always be chosen. I expect that if you change the index as I recommend, it will always be used for this query and results will always be instantaneous, regardless of the value passed in.
Index changes do always have consequences, so let's see what the price for this change will be.
By Hugo Kornelis 211 answered Jan 04 at 12:07 PM
Hugo, I am very thankful and impressed by your very thorough and clear explanation.
I have done as you suggested and your prediction is exactly what happened. After dropping and recreating the index with the "t_date" column upgraded from included to indexed, the optimizer now chooses the right index and the query is done in milliseconds instead of minutes.
If I may intrude on your generosity, and in order for me to be able to diagnose this in the future, where exactly can you see this:
I can't find this information myself on Plan Explorer, although it is exactly correct.
By mauripop 1 answered Jan 04 at 11:15 PM