Why is there a scan on Orders instead of a seek?
CREATE NONCLUSTERED INDEX [RF_IX_Orders_OrderNumber_OrderDate_etc] ON [dbo].[Orders] ( [OrderNumber] ASC, [OrderDate] ASC ) INCLUDE ([CartID]) CREATE NONCLUSTERED INDEX [IX_PackingSkid_DatePrinted_SkidName_etc] ON [dbo].[PackingSkid] ( [DatePrinted] ASC, [SkidName] ASC ) INCLUDE ([OrderNumber])
I don't understand why the estimate vs. actual for Orders in the plan is so far off. It is also way off for PackingSkid. I updated the statistics on both tables and still see the same plan and estimates. Both Orders.OrderNumber and PackingSkid.OrderNumber are non-nullable INTs.
The scan provides the key order necessary for the merge join. Presumably, a hash or loops join would estimate at a higher cost.
…why the estimate vs. actual for Orders in the plan is so far off.
The scan of the Orders table is below a merge join. The second input never produces a row, so the merge join terminates early.
It is also way off for PackingSkid
The estimate here is based on average density information for the DatePrinted and SkidName columns from the multi-column statistics associated with your second index.
SQL Server does not support multi-dimensional histograms. A filtered index or statistics on (SkidName) INCLUDE (OrderNumber) WHERE DatePrinted IS NULL would likely give a more precise estimate for that particular case.
Also, consider expressing the condition on @getRogueSupply so that an index could be used. There are multiple effective ways to do this, including using two separate statement blocks and an IF.
On a more general point, it is usually better to convert inside a CASE expression, e.g.:
...AND @getRogueSupply = CASE WHEN sc.CartName IS NOT NULL THEN CONVERT(bit, 1) ELSE CONVERT(bit, 0) END;
Adding the index you suggested made no difference. I then deleted it and added an index on PackingSkid (SkidName) INCLUDE (OrderNumber) WHERE DatePrinted IS NULL. That fixed the estimate on PackingSkid and Orders and dropped the estimated cost, without changing any join types (it did change the join order though). The weird thing is that it is not actually using the new index. It just wants it to exist. It warns that it is an unmatched index.
Changing the condition on @getRogueSupply didn't result in any change to the plan (it was already doing a seek on the PK to satisfy the join).
Without the new index:
Table 'Workfile'. Scan count 0, logical reads 0 Table 'Worktable'. Scan count 0, logical reads 0 Table 'PackingSkid'. Scan count 1, logical reads 36 Table 'packed_orders'. Scan count 1, logical reads 30 Table 'print_queue'. Scan count 1, logical reads 2 Table 'Orders'. Scan count 1, logical reads 4746 Table 'ShoppingCarts'. Scan count 0, logical reads 2
With the new index (that it doesn't actually use):
Table 'Workfile'. Scan count 0, logical reads 0 Table 'Worktable'. Scan count 0, logical reads 0 Table 'PackingSkid'. Scan count 1, logical reads 36 Table 'print_queue'. Scan count 1, logical reads 2 Table 'packed_orders'. Scan count 1, logical reads 30 Table 'Orders'. Scan count 1, logical reads 4746 Table 'ShoppingCarts'. Scan count 0, logical reads 2
In the second plan, the order is different, but the number of scans and logical reads is the same for each table.
I then tried dropping the filtered index and created it as just a filtered statistic instead: PackingSkid (SkidName) WHERE DatePrinted IS NULL. That resulted in a plan identical to the second one, with estimates almost as good.
I then recreated the statistics object as PackingSkid (SkidName, OrderNumber) WHERE DatePrinted IS NULL. That produced identical results as the single column statistics.
Having the statistics instead of the index gives up a bit in estimate accuracy (although I don't know why), but will be less costly to maintain and doesn't cause a warning in the plan, so I think I'll keep the statistics instead of the index.
Do you (or anyone else) have any other thoughts on this?