Sum on Fact table slowed down by Group By

Clive 2017-06-15 12:35:42

We have a bunch of queries over a time series which were taking over 1 minute to process. The Fact table contains just over 4m rows. Adding a new index on Date and CompanyKey and including the columns being summed resulted in the slowest query now running in 3s. However, we have some queries left over which group on country (fact table joins to DimCompany on CompanyKey).

If I remove the group by, the query runs in a few seconds. I realise the select distinct also slows things down, but I have ideas for that. I'm more focused on the group by performance at this point.

Aaron Bertrand 2017-06-15 12:42:50
Without even looking at the plan, I can draw some analogies to illustrate why it is costlier for SQL Server to aggregate over grouped data than total data. Imagine you are on stage in an auditorium, and someone asks you to count all the people in the room. Easy, right? Now, count all the people by eye color. Closer to data: count all the Smiths in the phone book. Now count all the smiths but group by the last digit in their phone number. Without getting all the people with the same eye color to sit in the same row, or sorting the phone book by last name and then the last digit in their phone number, it's a lot more work, isn't it?