Biggest question is why there is so much difference between the estimated rows and the actual rows for temp tables
We've had to comment out a fourth of the aggregation because it takes so long for it to run. It used to run for 1 hr to 1 hr 10 minutes. After commenting out the one part in both stored procedures (there's about 3 being used here) and rebuilding some indexes and updating some statistics, we've been able to bring the time down to 30 minutes. And we run this for a day at a time, so 1 day will take anywhere from 30 minutes to 40 minutes to run.
I've thought about making a filtered index based on what we're grouping by or at least making an index based on the main field being used, but where we seem to have the biggest disconnect of estimated rows and actual rows is when we insert them into the table where they will be housed for the rest of time.
And the most IO intensive portion is when these rows get inserted, they also get inserted into the index. That part takes a toll and is the most expensive portion of this query.