How to approach this business need on join condition
2013-08-22 03:44:44
Attached is create table definition and also copy of statistics that can be applied to the table and estimated execution plan. I have been thinking on how to efficiently to this join. Should I add many filtered indexes for every one condition and split the query into many pieces or ? The data in production will have over 600 million records. We can change the whole table design (still working on how to do this efficiently) at this point.
Thank you
link text
UPDATE s SET MtgId = m.MtgId FROM dbo.Pin_Mtg m INNER JOIN dbo.Pin_Mtg_Suppl s ON s.CntyCd = m.CntyCd WHERE s.Mtgid IS NULL --Assignments + Releases + Notice of Defaults AND (((s.PrimaryCatCd IN ('G', 'H', 'F') OR s.DocTyp IN ('CD', 'RD')) AND ((s.MtgRecordingBook <> '' AND s.MtgRecordingPage <> '' AND s.MtgRecordingBook = m.MtgRecordingBook AND s.MtgRecordingPage = m.MtgRecordingPage) OR (s.MtgDocYYDt <> 0 AND s.MtgDocNbr <> '' AND s.MtgDocYYDt = m.MtgDocYYDt AND s.MtgDocNbr = m.MtgDocNbr) OR (s.OrigDocDt <> 0 AND s.OrigDocNbr <> '' AND s.OrigDocDt / 10000 = m.MtgDocYYDt AND s.OrigDocNbr = m.mtgdocnbr) OR (s.OrigRecordingBook <> '' AND s.OrigRecordingPage <> '' AND s.OrigRecordingBook = m.MtgRecordingBook AND s.OrigRecordingPage = m.MtgRecordingPage))) --Mechanic liens OR (s.PrimaryCatCd = 'I' AND ((s.OrigDocDt <> 0 AND s.OrigDocNbr <> '' AND s.OrigDocDt / 10000 = m.DocYYDt AND s.OrigDocNbr = m.docnbr) OR (s.OrigRecordingBook <> '' AND s.OrigRecordingPage <> '' AND s.OrigRecordingBook = m.RecordingBook AND s.OrigRecordingPage = m.RecordingPage))) -- Modifications OR (s.DocTyp IN ('MO', 'CD', 'RD') AND ((s.OrigDocDt <> 0 AND s.OrigDocNbr <> '' AND s.OrigDocDt / 10000 = m.MtgDocYYDt AND s.OrigDocNbr = m.MtgDocNbr) OR (s.OrigRecordingBook <> '' AND s.OrigRecordingPage <> '' AND s.OrigRecordingBook = m.MtgRecordingBook AND s.OrigRecordingPage = m.MtgRecordingPage)))) OPTION (RECOMPILE)
2013-08-22 13:07:55
Some of the statistics statements in the setup script fail with an error. Some people will give up providing an answer if the setup script errors out. Perhaps it would be better to script statistics using the built-in features of SSMS rather than writing your own script?
The join confuses me because it makes the update non-deterministic. If you'd used MERGE to do this change, it would fail with an error. The join doesn't identify a single row in Pin_Mtg for each row of Pin_Mtg_Suppl. Hence the ANY aggregate after the join – the optimizer chooses an MtgId value essentially at random.
I would have expected a join on the primary key. Is there some unenforced logic that means only one row will qualify for the update? Otherwise, the query is just flawed at a fundamental level as it stands.
You need to resolve that issue before we talk about optimizing the details of the query.