Why does SQL Server ignore my indexes?

Henrik Staun Poulsen 2017-04-11 13:26:00

In our data warehouse, we have a fact table that looks like this:

CREATE TABLE [fct].[VMP10MinData](
    [VMP10MinDataID] [bigint] IDENTITY(1,1) NOT NULL,
    [DT1TurbineID] [int] NOT NULL,
    [DT1TurbineEventID] [int] NOT NULL,
    [TTimeStamp] [smalldatetime] NOT NULL,
    [TTimeStampUTC] [smalldatetime] NOT NULL,
    [Gen_RPM_Max] [numeric](9, 1) NULL,
    ...
    [Grd_InverterPhase1_Temp_AvgA] [smallint] NULL,
    [UpdatedDate] [datetime] NOT NULL,
    [IsCorrected] [bit] NOT NULL,
    [TRATransformJobID] [int] NOT NULL,
    [TTimeStampLocal] [smalldatetime] NOT NULL,
    [UpdatedID] [bigint] NULL,
CONSTRAINT [PK_VMP10MinData] PRIMARY KEY CLUSTERED 
(
    [DT1TurbineID] ASC,
    [TTimeStamp] ASC
)WITH (DATA_COMPRESSION = ROW ON PARTITIONS (207 TO 253),
DATA_COMPRESSION = PAGE ON PARTITIONS (1 TO 206)) ON [PS_FctVMP10MinData1]([TTimeStamp])
) ON [PS_FctVMP10MinData1]([TTimeStamp])
GO
 
CREATE UNIQUE NONCLUSTERED INDEX [IX_DT1TurbineID_PCTimeStamp_TTimeStamp] ON [fct].[VMP10MinData]
(
    [DT1TurbineID] ASC,
    [PCTimeStamp] ASC,
    [TTimeStamp] ASC
)WITH (DATA_COMPRESSION = ROW) ON [PS_FctVMP10MinData1]([TTimeStamp])
GO
 
CREATE UNIQUE NONCLUSTERED INDEX [IX_DT1TurbineID_TTimeStamp_TTimeStampUTC] ON [fct].[VMP10MinData]
(
    [DT1TurbineID] ASC,
    [TTimeStamp] ASC,
    [TTimeStampUTC] ASC
)
INCLUDE (   [DT1TurbineEventID],
    [Amb_LaPM_ActSectorInPeriod],
    [Amb_WindSpeed_Avg],
    [Amb_Temp_Avg],
    [Grd_Prod_PsblePwr_Avg],
    [Grd_Prod_Pwr_Avg],
    [Grd_Prod_Pwr_InternalDerateStat],
    [Grd_Sets_ActPwr_Source10Min],
    [Grd_Sets_ActPwr_ReferenceValue10Min],
    [HCnt_Avg_Tot],
    [HCnt_Avg_Gen1],
    [HCnt_Avg_Gen2],
    [HCnt_Avg_GrdOn],
    [HCnt_Avg_GrdOk],
    [HCnt_Avg_TrbOk],
    [HCnt_Avg_AlarmAct],
    [HCnt_Avg_Run],
    [HCnt_Avg_SrvOn],
    [HCnt_Avg_AmbOk],
    [HCnt_Avg_WindOk],
    [HCnt_TotAccumulated_Gen1],
    [HCnt_TotAccumulated_Gen2],
    [HCnt_TotAccumulated_GrdOn],
    [HCnt_TotAccumulated_WindOk],
    [Nac_Direction_Avg],
    [Prod_TotAccumulated_TotActPwr],
    [Sys_Logs_FirstActAlarmPar1],
    [Sys_Logs_FirstActAlarmPar2],
    [TTimeStampLocal],
    [VMP10MinDataID]) WITH (DROP_EXISTING = OFF, ONLINE = OFF, DATA_COMPRESSION = ROW) ON [PS_FctVMP10MinData1]([TTimeStamp])
GO
 
CREATE UNIQUE NONCLUSTERED INDEX [IX_TTimeStamp_DT1TurbineID] ON [fct].[VMP10MinData]
(
    [TTimeStamp] ASC,
    [DT1TurbineID] ASC
)WITH (DATA_COMPRESSION = ROW) ON [PS_FctVMP10MinData1]([TTimeStamp])
GO
 
CREATE UNIQUE NONCLUSTERED INDEX [IX_UpdatedID_DT1TurbineID_TTimeStamp] ON [fct].[VMP10MinData]
(
    [UpdatedID] ASC,
    [DT1TurbineID] ASC,
    [TTimeStamp] ASC
)
INCLUDE (   [UpdatedDate]) 
WHERE ([UpdatedID] IS NOT NULL)
WITH (DATA_COMPRESSION = ROW) ON [PS_FctVMP10MinData1]([TTimeStamp])
GO
 
CREATE UNIQUE NONCLUSTERED INDEX [IX_VMP10MinDataID_TTimeStamp] ON [fct].[VMP10MinData]
(
    [VMP10MinDataID] ASC,
    [TTimeStamp] ASC
)WITH (DATA_COMPRESSION = ROW) ON [PS_FctVMP10MinData1]([TTimeStamp])
GO

We have seen this issue previously, when we rebuild one partition and the statistics are invalidated because of different sample rates.
But I thought that was fixed with this CU.
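To check whether the sample rates really do differ across the statistics on this table, something like the following works (the DMF and its columns are standard; only the table name comes from the DDL above):

-- Compare how each statistic on the fact table was last sampled;
-- rows_sampled far below rows means a default-sampled statistic.
SELECT  s.name,
        sp.last_updated,
        sp.[rows],
        sp.rows_sampled,
        sample_pct = 100.0 * sp.rows_sampled / NULLIF(sp.[rows], 0)
FROM    sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.[object_id], s.stats_id) AS sp
WHERE   s.[object_id] = OBJECT_ID(N'fct.VMP10MinData');
GO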

The query wants to do two complete table scans, despite the good indexes.
Why does it prefer such a brute-force plan?

Best regards,
Henrik Staun Poulsen


Aaron Bertrand 2017-04-11 19:50:54
The predicates for the two halves of the union are, effectively, MinDataID > 6.8 billion and MinDataID <= 6.8 billion, give or take. Together that covers the entire table, and neither half narrows down the data enough for a seek to make sense - the number of rows in each half is well beyond the "tipping point" (and the query clearly can't take advantage of partitioning, because MinDataID and UpdatedID are not part of your partitioning key). The index IX_UpdatedID_DT1TurbineID_TTimeStamp might be usable, if only MinDataID were part of the key (or, better, the leading column in the key). Have you considered the missing index SQL Server suggested?
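To illustrate that last point, an index shaped roughly like this would let that predicate seek (the INCLUDE list here is a hypothetical placeholder; the real missing-index suggestion will name the exact columns the query needs):

-- Illustrative only: leading with the range column lets the optimizer
-- seek on VMP10MinDataID instead of scanning the clustered index.
CREATE NONCLUSTERED INDEX [IX_VMP10MinDataID_Covering] ON [fct].[VMP10MinData]
(
    [VMP10MinDataID] ASC
)
INCLUDE ([UpdatedID], [UpdatedDate])  -- hypothetical; take the list from the suggestion
WITH (DATA_COMPRESSION = ROW) ON [PS_FctVMP10MinData1]([TTimeStamp])
GO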
Henrik Staun Poulsen 2017-04-12 07:11:49
Hi Aaron,

Thank you very much for answering this question and your talk at SQLBits.

<< effectively, MinDataID > 6.8 billion

Unfortunately this is not quite correct. I have two watermarks: a) new rows, where VMP10MinDataID is between 6.80 billion and 6.81 billion, and b) updated rows, where I want about 1,500 rows around the 4.3 billion mark, but only where VMP10MinDataID is less than 6.8 billion.
I do understand that this might result in two sets of key lookups. That normally performs reasonably well.
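To make the shape concrete, the two halves look roughly like this (the watermark values are as described above, but the selected columns, and reading b) as a range on UpdatedID, are my illustration rather than the production query):

-- a) new rows since the last watermark
SELECT [VMP10MinDataID], [DT1TurbineID], [TTimeStamp], [UpdatedDate]
FROM [fct].[VMP10MinData]
WHERE [VMP10MinDataID] > 6800000000 AND [VMP10MinDataID] <= 6810000000
UNION ALL
-- b) updated rows below the new-row watermark
SELECT [VMP10MinDataID], [DT1TurbineID], [TTimeStamp], [UpdatedDate]
FROM [fct].[VMP10MinData]
WHERE [UpdatedID] > 4300000000 AND [VMP10MinDataID] < 6800000000
GO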

Best regards,
Henrik

Aaron Bertrand 2017-04-12 10:46:57
This is why I said effectively. Whether you have additional predicates or not, because there is no index that adequately supports your predicates, the union has to perform a full clustered index scan for both halves of the union, and SQL Server doesn't seem to think it will be able to filter any rows out (or use partition elimination). We might get more detail about this if you generate an actual plan from within Plan Explorer; with an estimated plan we can't see much at all.
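For reference, from a plain query window the actual plan can be captured with:

SET STATISTICS XML ON;
-- run the query in question here
SET STATISTICS XML OFF;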
Henrik Staun Poulsen 2017-04-12 12:03:01
Hi Aaron, I have added the current execution plan, which looks much better, and also finishes in 10 seconds. But the statistics were updated in between, and I do not have a copy of the old statistics.
Do you have any guesses about why SQL Server would pick such a brute-force plan, instead of the usual plan?
Aaron Bertrand 2017-04-12 12:17:13
My guess is that it was similar to the ascending key problem – stats had no idea how many rows could be filtered out, possibly because the values in your predicate were beyond the range covered by the histogram at the time. Tough to be certain now unless you get back into this scenario again.
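If you do land back in it, comparing the watermark values against the top of the histogram would confirm or rule that out:

-- If the highest RANGE_HI_KEY is below the watermark values in the
-- query, the optimizer is estimating off the end of the histogram.
DBCC SHOW_STATISTICS ('fct.VMP10MinData', IX_UpdatedID_DT1TurbineID_TTimeStamp)
    WITH HISTOGRAM;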
Henrik Staun Poulsen 2017-04-12 12:20:36
But normally, when you query outside the range, SQL Server estimates a whopping 1 row and prefers key lookups? It is as if it does not know that these indexes were present…
Aaron Bertrand 2017-04-12 12:24:58
With a single WHERE clause, maybe. Your predicates are slightly more complex than that, and nowhere in your original plan does the optimizer estimate a single row for anything. As soon as it gets beyond a certain threshold, due to the huge number of columns that are missing from the appropriate index, the tipping point takes over and a scan is used. Notice how expensive the key lookup is in your updated plan – this is not an optimal index for this query, either. It's great at eliminating rows, but not at supplying the data.
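You can see that trade-off by forcing the narrow index (an illustrative query; your real column list is much wider, which is exactly what makes the lookups hurt):

-- Forcing the narrow index makes the optimizer pay for a key lookup
-- per qualifying row to fetch the columns the index doesn't carry.
SELECT [DT1TurbineID], [TTimeStamp], [Grd_Prod_Pwr_Avg]
FROM [fct].[VMP10MinData] WITH (INDEX ([IX_VMP10MinDataID_TTimeStamp]))
WHERE [VMP10MinDataID] > 6800000000
GO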
Henrik Staun Poulsen 2017-04-12 13:53:45
Yes, key lookups are expensive. The estimated cost is 1,882, as opposed to the table scan cost of 641,176 🙂
This situation occurs quite regularly, and after the Easter holidays (which start in 10 minutes) I'll have a bunch of stats logged to disk, and then I'll check to see whether there are any clues as to why it picks such an expensive plan.
Thank you very much for your help.
Henrik Staun Poulsen 2017-04-24 13:48:48
I've been investigating this problem further, and yes, it is an ascending key problem, but it is also something else.

I've added two new plans that show a huge difference between the estimated and actual plans. The estimated plan warns about residual I/O, whereas the actual plan is just fast.

I also note that the plan looks more complicated when I rely on implicit casting, as opposed to explicit casting.
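As a minimal sketch of what I mean (the variable name is made up):

DECLARE @watermark numeric(12, 0) = 6800000000;

-- Implicit: numeric outranks bigint in type precedence, so the bigint
-- column is implicitly converted and the plan picks up extra operators.
SELECT COUNT(*) FROM [fct].[VMP10MinData]
WHERE [VMP10MinDataID] > @watermark;

-- Explicit: casting the variable once keeps the predicate on the bare
-- bigint column, which is simpler for the optimizer to estimate.
SELECT COUNT(*) FROM [fct].[VMP10MinData]
WHERE [VMP10MinDataID] > CAST(@watermark AS bigint);
GO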

Can you explain the difference between the estimated and actual plans?