Indexing Advice required

PRO 2014-05-01 22:10:44

I have a table that does not have any indexes but it is heavily used. I am trying to come up with a good indexing strategy for this table.

Here is the structure of the table and I am using SQL 2008 R2 EE

Create table Table1
(tpropid nvarchar(4000),
tSQLFilter nvarchar(max),
tConditionid nvarchar(4000),
nObjectType int,
nFilterId int,
nBuild bit,
dtDateUpdated datetime)

I have around 300,000 rows in table with 154,000 datapages. 80% of time, table is read, 10% of time, new rows are added, remaining time it is updated.

The above table does not have an identity column or a column that uniquely defines a row.

Although nObjectType is not unique, here is distribution of values

objecttype no.of rows
20005 110,000
2 2005
20014 300
20017 270

Max length of tPropId column = 1993
Max length of tConditionId column = 293

Typical select on this table would be…

Select tSQLFilter From table1 With(Nolock)
Where tPropID = @nPropertyIDs and tCondition = @nConditions and nObjectType = @nObjectType

With the constraint that I cant change table structure in any way, here is my thought

Create clustered index PK_Table1_nObjectType
on Table1(nObjectType)

create nonclustered index IX_Table1_Condition_Includes
on Table1 (tCondition)
include (tSQLFilter)

I decided to create a clustered index because I believe every table should have a clustered index. Because of datatypes, nObjecttype is only viable index

For non clustered index, although I wanted to include index on tPropId, because it can exceed 900 byte limitation on index, i cant use it in my index.
For tCondition column, although declared as 8000 bytes, the max it has ever taken is 600 bytes, so I am willing to take risk that it will not exceed.

since tSQLFilter is nvarchar(max), i have added this as a include column.

I have done couple of testing and it has significantly decreased page reads, but i would like to listen to your opinions and advise?

Dan 2014-05-02 07:47:25
Hi Pro,

I wonder if compression would reduce the read effort for you in this case? Technically that is a table ALTER though, so I'm not sure if that's an option in this case? Compression would reduce the Disk IO for this table. I think given your column types you'd have to look at page compression.

Dan

@SQLTrooper 2014-05-02 14:31:18
I may start by focusing on the table design – looks like some of those data types are unnecessary. Then, pick your clustered index based on data access patterns. Think of the phone book when you think about a clustered index – to add/update or retrieve a new person via last name is pretty simple, so last name makes sense. I do agree with the compression suggestion – I typically go with Page as well. CPU is typically not our bottleneck – memory and/or IO is, so compression is quite useful – let your statistics help you make that determination. Once you have all that completed, do some analysis on the execution plan of your common queries and create non-clustered indexes as needed.
PRO 2014-05-05 16:57:54
Thank you guys for your suggestion. Unfortunately, our CPU usage is very high so I can't enable compression now. I discovered that if I include tPropId column in the include list, SQL will avoid a key lookup. I didn't think that way. But, later after pondering for while, i made sense. Why would SQL do another key lookup operation if tPropId is also included in the index leaf level.
Sujai Karnam 2014-05-08 14:04:52
Clustered index key should be in such a way that it uniquely identify a row, basically with high selectivity. As part of your problem description, you have mentioned that there could be as many as 110K for nObjectType 20005. Having a Clustered index key on such a column (although SQL Server will add an uniquifier to it) can cause

  1. Huge data skewness in statistics, thus poor estimates while building execution plans, thereby bad plans
  2. Key lookups if exists would highly suffer since it has to now use the uniquifier
  3. Too much of fragmentation and page splits when you insert/update/delete data
  4. Range of X/IU locks on clustered index cascading to non clustered indexes. If range goes beyond 5000, there could be lock escalations
  5. there are many more…the list goes on…

Your clustered index key need not necessarily be part of your query. If nFilterid column has high selectivity, choose that as a clustered index or do consider composite key.

@SQLTrooper 2014-05-08 14:09:14