Variance in actual vs. estimated rows and poorly performing subquery

AndrewDP23 2017-09-28 00:42:29

I have a query block that takes 30 minutes as a subquery compared to 31 seconds in a INSERT INTO #TEMP followed by SELECT * FROM #TEMP in the main query.

I can't work out why inserting this into a temp table and selecting from that works so much better than including this query text as a subquery.

My first action when investigating this was to look at the huge rows returned from the table scan on Object1, and consider that statistics may be out of date. The estimated rows returned is 131354, however I've just updated statistics for all tables in the database using sp_updatestats and re-ran the subquery version, and now get 29 minutes 35 seconds.

Am I correct that the variance between actual and estimated rows returned from that table scan is likely the cause of the poor performance? Are there any suggestions on what could be done to improve this?

The code structure for the problem block is below.

SELECT DISTINCT 
    ce.Column1, ce.Column2, ce.Column3, ce.Column4,
    sum(ce.Column6) AS efts 
FROM Object9 ce
JOIN Object1 otd ON otd.Column8 = ce.Column7
WHERE ce.Column46 < @reportingyear 
    AND otd.RecordType IN ('L')
GROUP BY ce.Column1, ce.Column2, ce.Column3, ce.Column4
 
UNION ALL
 
SELECT DISTINCT 
    ce.Column1, ce.Column2, ce.Column3, 0 as Column4,
    sum(ce.Column6) AS efts 
FROM Object9 ce
JOIN Object1 otd ON otd.Column8 = ce.Column7
WHERE ce.Column46 < @reportingyear 
    AND otd.RecordType IN ('S','P')
GROUP BY ce.Column1, ce.Column2, ce.Column3, ce.Column4

The subquery plan section is:
alt text

When I dump this into a temporary table and select from this it takes 31 seconds.
alt text

I have confirmed the LEFT JOIN to this subquery returns the same number of rows in both versions (3,296), so there's no altering happening to the final result sets.

Note, in the separate temp table query I had MAXDOP(1) turned on to try to get that query closer to the subquery version by removing the (beneficial) parallelism that was occurring.

The attached plan query1-plan1-anonymized is the temp table based version.