COALESCE vs ISNULL and nested ISNULLs

Dan Holmes 2016-01-22 18:46:55

The plans attached (estimates only) show the difference of the same query with the exception that one uses COALESCE and the other is ISNULL. Much has been written about the difference before. COALESCE expands to a CASE expression and the datatype is determined by evaluating all expressions in the function. ISNULL doesn't behave that way. I get that part. I don't know how to use that knowledge and explain the difference in these plans.

Aaron Bertrand 2016-01-22 18:55:57
Looks like nested ISNULLs does a better job in this case, avoiding the hash matches that make the COALESCE version so expensive. But I think it may be a red herring that that is the only difference – it seems that in the good query the index seek on tblCalendar only estimates 337 rows and has more (and different) predicates than the clustered index seek used in the other query, which estimates 6,574 rows. I'd have to do a diff to spot any other differences in the query text, but is it possible these plans were constructed at different times (when different stats were in place) and/or with different values for the @Date parameter? I noticed the parameters tab doesn't include compiled or runtime values. Is the behavior the same with OPTION (RECOMPILE);?
Dan Holmes 2016-01-22 19:13:03
The text is the same. I started with the COALESCE text and made the ISNULL text from it. My only changes were those changes.

The estimates on that table are what is messing with my head. Why would that change because of an ISNULL vs COALESCE?

Aaron Bertrand 2016-01-22 19:15:19
Internal implementation? There are other behavior differences between them as well, for example I highlight some observed differences in this tip. Paul might have some more concrete ideas when he swings around next. In the meantime, I would say go with the one that performs better.
SQLkiwi 2016-01-23 11:48:52
Although COALESCE and ISNULL expressions can sometimes express the same logical requirement (if written very carefully), there are many internal differences that mean cardinality estimation and plan shape choices will often be different.

The query in question is too complex, with too many unknown factors (view definitions, data types, column nullability and so on) to analyse in detail, but:

  • As noted in the question, the two have different behaviours with respect to data types. Where this introduces implicit type conversions, cardinality estimation can be affected.
  • COALESCE is expanded to a fixed form of CASE expression, which cannot be split into component parts or moved around the plan tree in the same way nested ISNULL expressions can. In short, ISNULL generally allows the optimizer more freedom. This extra freedom can result in better or worse plans, of course.
  • SQL Server can reason about nullability with ISNULL but not so much with COALESCE. This can enable simplifications if an expression known not to be nullable (internally, during compilation) has this function applied. It can also affect cardinality estimation by removing null steps from intermediate histograms.
  • Cardinality estimation for ISNULL may be performed using different logic than for CASE WHEN x IS NOT NULL THEN y ELSE …. More generally, there is no guarantee that logically equivalent predicates expressed using different syntax will produce the same estimation, or take the same path through optimization. In fact, most often they will not, especially in non-trivial cases. Small changes in CE can have large effects on final plan shape.

It is not sensible to say which will be better on average. Personally, I prefer to avoid both wherever possible. I also much prefer to write simpler queries with purely relational operators. In my experience, this leads to the best results from the optimizer, and the most predictable results.

The queries in the question are good examples of (estimated) low-cost queries where the optimizer will not spend much time (by design) trying to find anything better than a broadly reasonable solution. Most choices in this type of query are based on heuristics and guesses rather than detailed CE calculations anyway.