Is there anything I can do to improve the overall duration of this merge statement?

GokhanVarol 2013-05-31 14:20:25

Can this run in parallel or can the spills be prevented? There are 64 cores on the server we can utilize if needed.
Can the duration of this merge speed up ?
The target table is partitioned (has indexed view on it which is forcing a wide update regardless etc)
Thank you
SQLkiwi 2013-06-02 05:46:57
> Can this run in parallel

I can't immediately see a reason that this query could not use parallelism, but you could try compiling it with TF 8649 to check that. MERGE often produces inefficient parallel plans though – see later comments.

…or can the spills be prevented?

I don't see any spill warnings in the posted plan.

I believe the optimizer do check if the values building the index were actually modified even though the row is being modified and if not it does not modify the index.

Yes. The Filters before the various Index Updates and Inserts are checking to see if the indexed values have changed; if not, the change is skipped.

Can the duration of this merge speed up?

Probably. I have generally found that MERGE does not produce plans that perform well with larger sets, and have had success by rewriting UPSERT merges as two separate INSERT and UPDATE statements. This may seem counter-intuitive, but it has been my experience. It can also be easier to get 'natural' parallelism with the separate statements.

Side note: I very much prefer to express the test (x IS NULL AND y IS NOT NULL) OR (x IS NOT NULL AND y IS NULL) OR (x <> y) as NOT EXISTS (SELECT x INTERSECT SELECT y). This rewrite is more compact and often produces more efficient plans.

GokhanVarol 2013-06-02 15:33:41
intersect and exists / not exists does the same job or ?
SQLkiwi 2013-06-03 03:01:58
Yes, see this blog post
GokhanVarol 2013-06-03 04:01:17
I love the intersect syntax. It sdded sbout 10+ percent to the duration. I wish there would be a way to eliminate the nested loops.
GokhanVarol 2013-06-03 13:28:50
After some through testing I found out in this particular case Merge statement is performing better compared to separate Insert / Update.
SQLkiwi 2013-06-04 04:30:18
Interesting, that's not the result you shared with me via email. Feel free to send me updated plans if you like.
GokhanVarol 2013-06-04 19:37:44
I know this is a long query plan and little difficult to work, but not sure how easy I can make this to be presented.
There are 9 merge statements towards the bottom all got converted to update + Insert (in the order) (table list MERGE tTax.LiProperty
MERGE tTax.PropertyLandDimensionDetail
MERGE tTax.FiProperty
MERGE tTax.PropertyLot
MERGE tTax.PropertyLegalPartyParsed
MERGE tTax.PropertyLegalPartyOrigName
MERGE tTax.PropertyLegalPartyName
MERGE tTax.PropertyLegalParty
MERGE tTax.Property
)
If more than 2 column to be compared in update I used the intersect to compare if not regular 3 step compare.
I noticed in some cases merge did parallelism and the update did not and the duration was higher because of that.
I have a difficulty making a decision right now use the merge or update / insert or is there more suggestions?
Thank you

link text

SQLkiwi 2013-06-04 20:15:59
I'll try to look in detail later. There is a trade-off, both with INTERSECT and MERGE vs separate INSERT/UPDATE/DELETE. Regarding MERGE, as I said before, I have had success replacing it with separate I/U/D on large sets. For smaller sets or simpler plans (or where MERGE goes parallel and the alternative does not) … etc etc etc … it might pay to stay with MERGE. There's rarely an 'always better' option with these things.
kfraz 2013-05-31 15:40:10
I would wrap the operators in () to show proper grouping. It looks like you only care if the "ON" statement matches when it also matches "t.AbsenteeOwnrIrisCd IS NULL AND s.AbsenteeOwnrIrisCd IS NOT NULL" Else the "ON" doesn't have to match but the "t.AbsenteeOwnrIrisCd IS NOT NULL AND s.AbsenteeOwnrIrisCd IS NULL" must match.

If this is what you wanted then OK but if it isn't then see below.

I would change to
WHEN MATCHED AND(

ISNULL(t.AbsenteeOwnrIrisCd,") <> ISNULL(s.AbsenteeOwnrIrisCd,")

OR

ISNULL(t.CntyLegalTxt,") <> ISNULL(s.CntyLegalTxt,")

    ...........

)

GokhanVarol 2013-05-31 17:21:32
Actually using ISNULL function in compare is slower than 3 part compare (that's not much compared to the cost of this query), that's the reason column level compare code is written the way it is (it could early exit in 3 part compare). There can be surrounding parenthesis to show the statements more clear but that's a coding preference does not change the execution.
GokhanVarol 2013-05-31 17:31:13
… and using isnull(val,") <> isnull(val2,") will consider
val = "
val2 = NULL
as equal whereas they are not
kfraz 2013-05-31 19:59:23
Correct. I don't see alot of code were people know the diffence between " and NULL so often NULL and " are used interchangably. If you distiguish the difference in your table then this would not provide valid results.
kfraz 2013-05-31 20:37:58
Looking at the execution plan again the only thing I see is the Index Insert for Property.PrevPclId and the Index Update for Property.SitusAddrStreetName_Ind account for 70% of the Est Cost but they are doing the least amount of inserts – updates. I would look to see if the indexes fragmented or doing a lot of page splits due to the insert or update.
GokhanVarol 2013-05-31 21:42:11
I believe the optimizer do check if the values building the index were actually modified even though the row is being modified and if not it does not modify the index.