Why does the second query perform so much better than the first, but does more logical reads?

Mark Freeman 2017-05-31 18:09:29

A developer presented me with the first query (note that this is the simplest query that shows the issue. The original is much more involved.) and we couldn't see why the optimizer was choosing to read hundreds of thousands of rows from the table with many rows when it could simply scan the 5,304 rows from the table with fewer rows and then seek into the big one to get the 1,570 rows that match.

In the second version, forcing the loop join resolved that and improved performance, but does a lot more logical reads. This is the output with STATISTICS IO, TIME ON set:

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 6 ms.

(1570 row(s) affected)
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'OrderFraud'. Scan count 1, logical reads 1098, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ChargebackReview'. Scan count 1, logical reads 50, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:
CPU time = 78 ms, elapsed time = 75 ms.

(1570 row(s) affected)
Table 'OrderFraud'. Scan count 0, logical reads 16252, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ChargebackReview'. Scan count 1, logical reads 50, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 10 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

Rob Farley 2017-06-01 00:09:49
The Query Optimizer chooses the Hash + Scan because the IO is so much smaller. On a system which is memory-bound, many of those logical reads might've been physical reads and the time taken could've been a lot higher. On a system where those 1098 pages can stay in memory, doing 16252 logical reads across them might end up quite cheap. Building a hash table with 5304 rows and then probing it 440924 times might be cheaper for IO, but has a bigger footprint in other ways.

So the performance has the potential to be a lot worse with 16252 reads. And that's why the QO makes that choice.

Mark Freeman 2017-06-01 18:44:03
Thanks, Rob! One of the shortcomings of the QO is that it doesn't take I/O speed into account. In our environment, the data is likely to be in memory and even if not, we're on SSD so physical I/O is much less costly than it is expecting. The QO seems heavily weighted in favor of reducing I/O even if it has to burn a lot of CPU to compensate.
Rob Farley 2017-06-02 05:13:00
Yes. I think we'll see this weighting change over time. Historically, physical IO has been the one to avoid. With that less-often the case, I think we'll see ways of communicating this to the QO. In the meantime, your hint is probably useful (but generally, avoid hints when possible).
Mark Freeman 2017-06-03 00:34:24
Agreed. I think I use about one query hint per year. ๐Ÿ™‚