Combining multiple fields into one in a non-OLTP, closer-to-DW scenario

GokhanVarol 2013-07-06 21:26:37

I am producing a copy of our data for a linking process, and I want to avoid having too many indexes. The linking process will need a single key to join on, store, and send to customers, whereas the source system has composite keys of 4-6 columns.

It would have been perfect if I could create a single key containing a compressed version of the composite key, one that could be packed and unpacked easily without any loss, and that would fit into a bigint.
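For example, if the composite key were just two non-negative int columns (the names here are made up for illustration), the packing would be simple arithmetic:

    DECLARE @hi int = 123, @lo int = 456;

    -- Pack: shift the high part up by 2^32 (= 4294967296) and add the low part.
    DECLARE @key bigint = CAST(@hi AS bigint) * 4294967296 + @lo;

    -- Unpack without loss.
    SELECT @key / 4294967296 AS HiPart,
           @key % 4294967296 AS LoPart;

But an 8-byte bigint only holds two full 4-byte parts that way, which is the problem with 4-6 key columns.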

It seems a bigint is not going to cut it: there are not enough bytes, and I would need at least a 12-byte structure to store the data. I would like the joins to be as efficient as possible under these conditions. The key I am planning to create is not nullable.

I created a test case: I inserted 10 million rows into a table with a unique clustered index on the key column, then joined the table to itself to get a row count and compared the durations. I created five such tables with different key data types (int, bigint, binary(8), binary(16), varbinary(16)) and repeated the test 20 times for each table.
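Roughly, each test had this shape (the binary(16) variant is shown; the table and column names here are illustrative, not the originals):

    CREATE TABLE dbo.KeyTest_Binary16
    (
        KeyCol binary(16) NOT NULL,
        CONSTRAINT PK_KeyTest_Binary16 PRIMARY KEY CLUSTERED (KeyCol)
    );

    -- Timed self-join: pushes 10 million key comparisons through the clustered index.
    SELECT COUNT(*)
    FROM dbo.KeyTest_Binary16 AS a
    JOIN dbo.KeyTest_Binary16 AS b
        ON b.KeyCol = a.KeyCol;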

If this test is fair, then binary(16) has close to 30% overhead over bigint, and binary(8) close to 20%. Packing and unpacking binary data is easy and cheap, and a single key can help avoid secondary storage, so it seems like it will be worth it.
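To illustrate what I mean by easy and cheap, here is a minimal sketch assuming the composite key is three int columns (the @k names are hypothetical):

    DECLARE @k1 int = 123, @k2 int = 456, @k3 int = 789;

    -- Pack: concatenate the byte images into a fixed 12-byte value.
    DECLARE @packed binary(12) =
        CAST(@k1 AS binary(4)) + CAST(@k2 AS binary(4)) + CAST(@k3 AS binary(4));

    -- Unpack: SUBSTRING works on binary, so the original parts come back losslessly,
    -- including negative values, since the full 4-byte image is preserved.
    SELECT CAST(SUBSTRING(@packed, 1, 4) AS int) AS KeyPart1,
           CAST(SUBSTRING(@packed, 5, 4) AS int) AS KeyPart2,
           CAST(SUBSTRING(@packed, 9, 4) AS int) AS KeyPart3;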

Any suggestions, or any surprises I might run into, if I choose a binary(12) NOT NULL column to store the composite key?

[Attachment: data type test results]