Index not being used

The plan is suggesting creation of missing index (ProcessedDate, MessageType).

I have an index on (ProcessedDate, MessageType, FlagBypassdispatcher). Why is it unable to use that?

Does SQL server index null columns?

And I just wanted to add that if I force the query to use the index, IO and elapsed statistics seem much better.

 select * 
 from local_gea_integrationtriggeredevents
     with(index(NC_ProcessedDate_MessageType_FlagBypassDispatcher))
 where processeddate is null
 and messagetype <> 'SerialNumber-Move'
 order by inserteddate

Clustered index scan performs about 8479 logical reads and takes 33 milliseconds

Forcing it to use the index on (ProcessedDate, MessageType, FlagBypassdispatcher), it performs 5 logical reads and takes 4 milliseconds...

So, what's up with SQL Server? Am I missing something?

Thanks.

avatar image By Peachy 16 asked Jul 01, 2015 at 02:47 PM
more ▼
(comments are locked)
10|10000 characters needed characters left

2 answers: sort oldest

You're using SELECT * - this makes SQL Server assume that, while the index can be useful to locate the rows that match the predicate, the required key lookups that would result to pull the rest of the columns would be more expensive. (Sometimes a scan is the most efficient way to get data, even when an index exists.)

This is especially true when estimates are off - notice that SQL Server estimated over 2,000 rows would be returned, and this is why it chose the plan it chose, because 2,000+ lookups would be more costly. It did not know when formulating the plan that you would only actually have 8 rows at runtime.

This could be because (a) the stats really are out of date, or (b) a plan was initially cached based on different parameter values. In your case we can rule out (b) because on the Parameters tab we see that the same parameter value was compiled with the plan. And it may be (a) or it may be partly due to the "inequality effect" touched on here.

If this is a common predicate, you may want to consider a filtered index that has both predicates.

You could also consider outputting fewer columns in the select list if you in fact do not need all columns; this will make using the index more compelling. But first and foremost you should be sure that statistics are up to date so that SQL Server might get the right estimates.

avatar image By Aaron Bertrand ♦ 1.7k answered Jul 01, 2015 at 03:27 PM
more ▼
(comments are locked)
avatar image Peachy Jul 01, 2015 at 05:21 PM

I am trying to post a comment but it won't let me...

avatar image Peachy Jul 01, 2015 at 05:23 PM

The inequality effect is interesting. Any known workarounds for it? I checked the statistics for the Index and it only has histograms based on the ProcessedDate column. Why is that when the index is based on 3 columns? I think this is where it got the approximation of 2,000 rows

I am trying to include the results of dbcc show_statistics but it won't let me post the comment when I do. And I can't attach it as an answer because I can't answer twice.... Here is a small snippet of it:

Name Updated Rows Rows Sampled Steps Density Average key length String Index Filter Expression Unfiltered Rows


NC_ProcessedDate_MessageType_FlagBypassDispatcher Jun 30 2015 8:30PM 332205 332205 190 0.6603539 25.68923 NO NULL 332205

(1 row(s) affected)

All density Average Length Columns


5.192216E-06 7.939098 ProcessedDate 5.19103E-06 21.68923 ProcessedDate, MessageType 5.19103E-06 21.68923 ProcessedDate, MessageType, FlagBypassDispatcher 3.010189E-06 25.68923 ProcessedDate, MessageType, FlagBypassDispatcher, Id

(4 row(s) affected)

RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS


NULL 0 2529 0 1 2015-06-02 14:48:38.140 0 1 0 1 2015-06-08 08:00:00.000 955 21252 721 1.324549

10|10000 characters needed characters left

The plan is suggesting creation of missing index (ProcessedDate, MessageType).I have an index on (ProcessedDate, MessageType, FlagBypassdispatcher). Why is it unable to use that?

Missing index suggestions are hints rather than firm recommendations. When the optimizer attempts to match query predicates to indexes and does not find an ideal one, it may generate a suggestion. These suggestions are opportunistic and not based on anything like a full analysis. There are also cases where internal limitations, or a lack of information, leads to a "missing index" being suggested that is redundant or just not very useful.

In your case, the ideal seek would actually be two separate seeks in one. The first would be on ProcessedDate IS NULL and MessageType > 'SerialNumber-Move'. The second seek would be on ProcessedDate IS NULL and MessageType < 'SerialNumber-Move'.

Unfortunately, the optimizer cannot currently generate a multi-part seek combining an equality with an inequality, unless an INDEX or FORCESEEK hint is used. This restriction is a balance between exploring every possible index usage option, and finding a reasonable plan quickly.

I recommend that people treat missing index notifications as a flag that the indexing on the table might be improved somehow. A proper analysis by the DBA is needed in almost all cases because the suggested index is unlikely to be the best possible, or broadly useful for a range of queries.

Does SQL server index null columns?

Yes indexes can be built on nullable columns, and yes the optimizer can build plans that seek to a null. This was not possible in very early versions of SQL Server, but it has been the case since at least SQL Server 2000.

If I force the query to use the index, IO and elapsed statistics seem much better.

Yes. The optimizer likely considered using the existing index with lookups to fetch the uncovered columns but rejected the idea for cost reasons. The estimated number of rows that match the two predicates is much higher than the actual value. For 8 rows, no doubt the seek + lookup is more efficient. If the optimizer had known that, it would have chosen the seek.

The inequality effect is interesting. Any known workarounds for it?

That blog post is not an entirely complete analysis. I added a comment there, but it is in the queue for moderation. Essentially, the plan in that example was subject to simple parameterization, leading to a dynamic seek plan. Anyway, it is not directly relevant to your question.

checked the statistics for the Index and it only has histograms based on the ProcessedDate column. Why is that when the index is based on 3 columns? I think this is where it got the approximation of 2,000 rows

The statistics associated with an index only ever have a histogram for the leading column of the index. This is by design. The statistics do capture average density for all column combinations, but this is only useful where all predicates in the query are equality comparisons (an yours are not).

The 2,000 row estimate is based on the number of rows for which ProcessedDate was NULL when the statistics were last updated, combined with a separate estimate of the number of rows where messagetype <> 'SerialNumber-Move' based on the single-column statistics for the messagetype column. SQL Server makes a guess at how to combine these two pieces of statistical information.

As Aaron mentioned, the best way to improve the estimates for this query (and so, probably, the execution plan) is by providing a better index, reducing the number of columns returned, and/or providing filtered statistics. The best option for you depends on your knowledge of the wider workload.

For example, the following filtered statistics would probably improve the estimates and produce a better plan, without having to create a new index:

 CREATE STATISTICS stats_name
 ON dbo.local_gea_integrationtriggeredevents
     (processeddate)
 WHERE messagetype <> 'SerialNumber-Move';

Or, you could get the filtered stats for free with a filtered index that would also avoid the sort, like:

 CREATE INDEX index_name
 ON dbo.local_gea_integrationtriggeredevents
  (processeddate, inserteddate)
 INCLUDE (messagetype)
 WHERE messagetype <> 'SerialNumber-Move'

Alternatively, you may decide that an index or forceseek hint is the appropriate action.

avatar image By SQLkiwi ♦ 6.6k answered Jul 02, 2015 at 04:28 AM
more ▼
(comments are locked)
avatar image Peachy Jul 07, 2015 at 04:43 PM

Thank you very much SQLkiwi for your response. It was very helpful.

10|10000 characters needed characters left
Your answer
toggle preview:

Up to 50 attachments (including images) can be used with a maximum of 209.7 MB each and 209.7 MB total.