Why is the sort operator needed in this plan

GokhanVarol 2013-04-10 14:47:25

I cannot see why sort is needed in the queryplan (refer to attachment and screenshot)? It seems to me the query plan does not need a sort whatsoever, tehre is a keplookup and a clustered index seek following the sort which I don't believe they require a sort.
Thank you

link text

SQLkiwi 2013-04-11 09:17:29
The sort is an optimization to turn random I/O from the nonclustered index seek key order into sequential I/O against the clustered index – a Key Lookup is, after all, just a singleton Clustered Index Seek.

The sort is spilling on multiple passes turning an optimization attempt into a bit of a horror. Fix the cardinality estimate coming from the seek – it is far too low right now, causing the spill. You do not want a magic trace flag to turn off the sort optimization – fix the underlying cause.

The residual predicate on the seek, and the nasty predicates in the Filter immediately after are always going to be a cardinality estimation nightmare. It's only 240K rows; materialize the set in a temporary table.

Edit: Based on the additional information in the comments, there are two things you can do:

  1. If the 'best' plan is fixed, force the plan you want with a plan guide. If you cannot find a parameter value to avoid the sort, manually remove it from the XML show plan and use that to apply a plan guide.
  2. If all else fails, generate a plan with TF 8738. This will remove the pre-lookup sort optimization.
GokhanVarol 2013-04-11 13:23:15
The tables in production is larger (about 80+ times larger). I did not want to create a temp table for them. I played around a little and decided to hide the filter that causing the wrong estimates behind an optimizer for variable (@TransFilter_Fake) which seemed to work both on production and my test system.
I am trying to avoid temp tables over 100 million rows, does it make sense ?
Thanks

DECLARE @StartDate DATETIME = '1/1/2000'
, @EndDate DATETIME = '3/1/2013'
, @TransFilter_Fake BIT = 1

SELECT ……….
FROM #DataSupplier cnty
INNER JOIN Trans.vTrans tt ON tt.CntyCd = cnty.CntyCd
OUTER APPLY ( SELECT ts.StdZipCd
FROM tTrans.TransSitus ts WITH ( FORCESEEK, INDEX = 1 )
WHERE tt.CntyCd = ts.CntyCd
AND tt.BatchDt = ts.BatchDt
AND tt.BatchSeq = ts.BatchSeq
–AND pp.CntyCd IS NULL
AND tt.PclId IS NULL
) ts
LEFT JOIN tTax.Property AS pp ON cnty.CntyCd = pp.CntyCd
AND tt.PclId = pp.PclId
AND tt.PclSeqNbr = pp.PclSeqNbr
INNER JOIN #CBSA AS cbsa ON LEFT(pp.SitusStdZipCd, 5) = cbsa.zip
LEFT OUTER JOIN #CntyConformingLoanAmt AS ll_cnty ON SUBSTRING(CAST(ISNULL(tt.SaleDt, tt.RecordingDt) AS VARCHAR), 1, 4) = ll_cnty.DateTo
AND cnty.CntyCd = ll_cnty.CntyCd
LEFT OUTER JOIN #GlobalConformingLoanAmt AS ll_default ON SUBSTRING(CAST(ISNULL(tt.SaleDt, tt.RecordingDt) AS VARCHAR), 1, 4) = ll_default.DateTo
WHERE ( @TransFilter_Fake = 0
OR ( tt.PrimaryCatCd IN ( 'A', 'B' ) /* Arms Length/Non-Arms Length Purchase /
AND ISNULL(tt.SaleDtDerived, tt.RecordingDtDerived) BETWEEN @StartDate AND @EndDate
AND SUBSTRING(CAST(ISNULL(tt.SaleDt, tt.RecordingDt) AS VARCHAR), 5, 2) <> '00' /
Ignore sales for month 00*/
)
)
AND ( ll_cnty.CntyCd IS NOT NULL
OR ll_default.DateTo IS NOT NULL
)
AND ( tt.SaleDtDerived IS NOT NULL
OR tt.RecordingDtDerived IS NOT NULL
)

OPTION ( RECOMPILE, OPTIMIZE FOR ( @TransFilter_Fake = 0 ) );

SQLkiwi 2013-04-19 03:34:07
I have updated my answer based on the extra information in your comment. It includes a magic trace flag.
Dan Holmes 2013-04-10 14:57:58
I don't really know but i am going to guess so that when someone else answers that really does know, i can be on record for being wrong (or right).

I think it might allow read-aheads on the lookup. I can't find the read-ahead statistics in the PE output to guess.

SQLkiwi 2013-04-11 09:18:16
Good guess, but prefetching does not require a sort like that.
Dan Holmes 2013-04-11 12:59:35
i was closer than i thought since i based my guess on the reads being faster if they were sequential but i thought that would have only helped in a pre-fetching situation. thanks for the link.
arvindravish 2013-04-10 15:28:17
Is the table partitioned on PclId? It looks like the sort is fetching rows from all partitions and then sorting it to prevent rewinds before the join. Try creating a non-aligned non clustered index on CntyCd, BatchDt, BatchSeq columns on Trans table.
GokhanVarol 2013-04-10 15:30:53
Partition key is on CntyCd char(5) on tables in tTrans and tTax Schema.
GokhanVarol 2013-04-10 15:31:26
The clustered index on Trans table is already CntyCd,BatchDt,BatchSeq
GokhanVarol 2013-04-10 16:59:56
Interesting to note, I disabled prefetching and even that turns off the prefetching in nested loops query plan still have a sort in it.

OPTION (RECOMPILE, QUERYTRACEON 8744)
https://www.simple-talk.com/sql/performance/sql-server-prefetch-and-query-performance/