Additional Tuning Opportunities
At this point, I have the query tuned from 3 minutes to 2 minutes. The table being summarized contains 81 fields per record. I tried a test and partitioned the data, then added a columnstore index to each partition, which brought the query down from an average of 2 minutes to less than 20 seconds.
Looking that this original query, can anyone provide some suggestions as to things to look at to try and improve the processing time, other than the partitions and columnstore indexes?
A columnstore compresses data much better (less IO), and allows you to read only the columns used in this query (yet less IO). If you look at STATISTICS IO, you should see a huge decline in the logical reads.
A columnstore also enables the query to execute in batch mode, which make things go orders of magnitude faster.
(I don't think that the partitioning you added did anything for performance)
I just wanted to make sure that neither you nor other people get the impression that partitioning by itself is a tuning tool. That appears to be a common misconception so I just wanted to make that clear. 😉
The reason for the partitioning was due to the ColumnStore indexes and trying to prevent/minimize rebuilds or drop/create(s) in an OLTP database. Makes sense, as prior to SQL Server 2017, the table with the non-clustered column store index became "read only", so the partitioning helped in only having to rebuild a smaller set of data. When we first started the columnstore tests, it was with SQL Server 2014.