Why does a query inside a IF EXISTS statement not perform the same as outside

tbrickle 2018-04-19 21:03:43

I need help understand what option is causing this issue.

The query outside the if exists, returns in 1 seconds.

The query inside the if exists, takes 1.5 minutes to complete.

Can you help me understand what is the possible issue and how I might correct this issue?

Hugo Kornelis 2018-04-22 14:58:18
The reason this happens is because EXISTS directs the optimizer to try to find a plan that quickly finds the first matching row. In most cases, the result is an execution plan that would take more time for finding all rows, but is quicker to arrive at the first row.

In your specific case, this backfires. Expecting to find a match fast, the execution plan uses Nested Loops joins and other constructions that are better for small sets of data but scale bad. But the expectation was wrong, there was no match at all so the Nested Loops join ends up processing ALL rows.

The original plan uses Hash Match for the same join, which has a higher startup cost but scales much better.

In this specific case (and really in this specific case only!!) you might get better performance by replacing IF EXISTS with what I normally consider an anti-pattern: "IF (SELECT COUNT() FROM …) > 0" (In this case, COUNT() would replace DISTINCT gc.Contract_Code in your query.

Again, I cnanot stress this enough: in 99.9% of all cases I would do the reverse, I have already improved performance of countless procedure by replacing the COUNT() > 0 check with an EXISTS check. This is really a weird exception. When you make this change, I suggest adding a huge comment block to explain the reason. And create a recurring task for yourself to re-check the performance of the two alternatives every two or three months or so. When your data changes, you may revert to a more typical pattern and then the COUNT() will suddenly start to hurt instead of help you.

Hugo Kornelis 2018-04-22 15:00:21
In my answer, the Forum interface has mistaken every star I used as the argument for COUNT as start or end of an text in italics.

So wherever you see COUNT() it should be COUNT("star") [not the text "star" but the symbol]. And you can ignore the italics

tbrickle 2018-04-23 15:06:31
Thank you very much for posting your findings and help. That explains a lot as to why it is happening.