I can see the thick black line where I'm doing a clustered index scan on my Estimate table — I'm guessing this is where the pain point is. I'm not sure what index(es) are missing to alleviate this.
I see the huge clustered index scan (cost-wise), but don't know how to proceed.
Thanks,
Rob
Try a nonclustered index with PrimaryEmployeeID as the index column and INCLUDE(SecondaryEmployeeID). That looks like what the Lazy Spool is caching and what the Sort before the Nested Loops join is ordering by.
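A minimal sketch of that index (the index name is mine, and I'm assuming the table is dbo.Estimate as in your plan):

    CREATE NONCLUSTERED INDEX IX_Estimate_PrimaryEmployeeID
        ON dbo.Estimate (PrimaryEmployeeID)      -- key column feeds the sort/spool
        INCLUDE (SecondaryEmployeeID);           -- included column avoids a lookup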
If you do create the above index, it will create statistics automatically but you will still need to update them on a regular basis.
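For example, to refresh them by hand (the auto-created statistics object shares the index name, here the one from the sketch above):

    UPDATE STATISTICS dbo.Estimate IX_Estimate_PrimaryEmployeeID WITH FULLSCAN;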
Hope this helps…
That being said, if you can't manage a result table then you should definitely simplify this query. I would start by writing clean selects and joins, performing the transformations in steps so that you can cleanly join sets together. One step could be to join the Orders and Estimate tables so that you get a more appropriate join that doesn't spool and reduces the set more efficiently (see the sketch below). Another step would be to transform data in a clean select so that you don't overly burden the optimiser when joining to other tables. Are your users really consuming 1.7 million rows? Perhaps there is scope to reduce this set.
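A minimal sketch of that first step; the table and join columns are hypothetical since the actual query isn't shown:

    -- Step 1: reduce Orders + Estimate to one materialised set
    SELECT o.OrderID,
           e.EstimateID,
           e.PrimaryEmployeeID,
           e.SecondaryEmployeeID
    INTO   #OrderEstimate
    FROM   dbo.Orders   AS o
    JOIN   dbo.Estimate AS e
           ON e.EstimateID = o.EstimateID;   -- hypothetical join condition

    -- Step 2+: join the remaining tables to #OrderEstimate instead of the base tables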
Another element that is slowing this down is the sorting, which again is based on nested CASE expressions. Can the consuming application not handle the sort instead?
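If the sort has to stay in the database, one alternative (not from the post above, and with hypothetical column names) is to compute the sort key once in a staging step and order by the plain column:

    SELECT e.EstimateID,
           CASE WHEN e.StatusCode = 1 THEN 0
                WHEN e.StatusCode = 2 THEN 1
                ELSE 2
           END AS SortKey        -- CASE evaluated once, stored as a column
    INTO   #Staged
    FROM   dbo.Estimate AS e;

    SELECT EstimateID
    FROM   #Staged
    ORDER BY SortKey;            -- cheap sort on a simple column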
One last thing you might want to look at is correlated (multi-column) statistics to improve your estimates.
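A hedged example of such a statistics object (the column pairing is illustrative); the density information it captures lets the optimiser account for correlation between the columns:

    CREATE STATISTICS st_Estimate_PrimarySecondary
        ON dbo.Estimate (PrimaryEmployeeID, SecondaryEmployeeID);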
Wayne
-- Rebuild the existing index in place, adding SecondaryEmployee to the INCLUDE list
CREATE INDEX idxEstimate_PrimaryEmployee
    ON dbo.Estimate (ID)
    INCLUDE (PrimaryEmployee, SecondaryEmployee)
    WITH (DROP_EXISTING = ON);
Despite the name, that index appears to be on (ID, PrimaryEmployee), or possibly on (ID) INCLUDE (PrimaryEmployee).
The modified index would also remove the Key Lookup on the preceding join in the plan.
The query is very large. The only way you are going to get a reasonable query plan for it as it stands is to provide "obviously best indexes". That means providing nonclustered indexes that cover the query (avoiding lookups) and which can provide useful ordering.
If the query can be broken up into several parts, I would strongly urge you to consider that. Not only will it make life easier for the optimizer, it is likely to make future maintenance easier too.
-- Statistics freshness per object: when each stats object was last updated
-- and how many modifications have accumulated since
SELECT OBJECT_NAME(sp.object_id) AS object_name,
       s.name AS stats_name,
       sp.stats_id,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.steps,
       sp.unfiltered_rows,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE sp.object_id > 100          -- skip system objects
ORDER BY object_name;