High number of row estimates
2016-12-20 19:26:34
What I can do to improve esitmates? Actual rows in about 10,000 and estimated is 500 billion (500,000,000,000) records.
Aaron Bertrand 2017-01-08 17:24:00
A WHERE clause with 15 AND/OR conditions quickly becomes difficult for any human, never mind the optimizer, to determine proper estimates. There are just too many possibilities to explore. Have you considered simplifying some of this for the optimizer by using #temp tables to store intermediate results, instead of trying to do everything in a single query with stacked CTEs?
2017-01-09 17:49:42
The major issue here is that computing the set pta_recent is very inefficient, as the query is currently written. The rest of the query is largely irrelevant from a performance perspective.
If you are unable to change the current indexing, consider the following query in place of pta_recent:
SELECT D.trc_number, -- pivot columns: first_pta = MAX(CASE WHEN CA.rn = 1 THEN CA.lph_id ELSE NULL END), second_pta = MAX(CASE WHEN CA.rn = 2 THEN CA.lph_id ELSE NULL END), third_pta = MAX(CASE WHEN CA.rn = 3 THEN CA.lph_id ELSE NULL END) FROM ( -- Find the distinct trc_number values using the index SELECT DISTINCT LH.trc_number FROM dbo.legpta_history AS LH WHERE LH.trc_number > '' ) AS D CROSS APPLY ( -- For each trc_number, find up to 3 highest lph_id values SELECT TOP (3) rn = ROW_NUMBER() OVER (ORDER BY LH2.lph_id DESC), LH2.lph_id FROM dbo.legpta_history AS LH2 WHERE LH2.trc_number = D.trc_number ORDER BY LH2.lph_id DESC ) AS CA GROUP BY D.trc_number;
This can use the existing indexing to obtain the set needed without any sorting:
This will be especially efficient if there are many lph_id per trc_number on average.
If you can change the indexing, consider replacing the existing index on trc_number with one on the same column, but sorted descending:
CREATE INDEX i ON dbo.legpta_history (trc_number DESC)
Then use this query:
SELECT N.trc_number, first_pta = MAX(CASE WHEN N.rn = 1 THEN N.lph_id ELSE NULL END), second_pta = MAX(CASE WHEN N.rn = 2 THEN N.lph_id ELSE NULL END), third_pta = MAX(CASE WHEN N.rn = 3 THEN N.lph_id ELSE NULL END) FROM ( SELECT LH.lph_id, LH.trc_number, rn = ROW_NUMBER() OVER (PARTITION BY LH.trc_number ORDER BY LH.lph_id DESC) FROM dbo.legpta_history AS LH WHERE LH.trc_number > '' ) AS N WHERE N.rn <= 3 GROUP BY N.trc_number;
Expected plan:
There are other refinements possible, including a filtered index, and/or keyed on (trc_number ASC, lph_id DESC) (to potentially allow a parallel plan).