Unnecessary reads in Query with OFFSET / FETCH NEXT and subselect
But when I set the offset to e.g. 400, it executes the subselect 450 times, although only 50 rows are shown and the result of the subquery does not affect which (or how many) rows are returned.
Is there a way to avoid this (and perform only 50 reads on BusinessEntity when paging to later pages)?
SELECT *
FROM
(
    SELECT p.FirstName, p.MiddleName, p.LastName,
    (
        SELECT be.ModifiedDate
        FROM Person.BusinessEntity AS be
        WHERE be.BusinessEntityID = p.BusinessEntityID
    ) AS be_modified
    FROM Person.Person AS p
) AS s
ORDER BY s.LastName
OFFSET 0 ROWS FETCH NEXT 50 ROWS ONLY;
Notice that you have multiple operators that need to read all 450 rows: the 50 you need in your result, plus the 400 that precede them in last-name order. Also notice that, because you expressed the seek into BusinessEntity as a subquery, that clustered index seek was executed 450 times (once for each row, including the rows you end up discarding). This is because those rows are retrieved before OFFSET / FETCH performs any filtering.
I think you'll find this form has better performance:
;WITH pg AS
(
    SELECT BusinessEntityID
    FROM Person.Person
    ORDER BY LastName
    OFFSET 400 ROWS FETCH NEXT 50 ROWS ONLY
)
SELECT p.FirstName, p.MiddleName, p.LastName,
       be_modified = be.ModifiedDate
FROM Person.Person AS p
INNER JOIN pg
    ON p.BusinessEntityID = pg.BusinessEntityID
INNER JOIN Person.BusinessEntity AS be
    ON p.BusinessEntityID = be.BusinessEntityID
ORDER BY p.LastName; -- maybe OPTION (RECOMPILE) here
This produces the following plan:
This looks a little more complex, but while you still have to read 450 rows, that work happens only in the skinny index scan that retrieves the keys from Person. The other operators only need to touch the 50 relevant rows that come back from the CTE. And by joining to BusinessEntity instead of using a subquery that runs before the filter, that seek now only has to be executed 50 times (once for each row that actually needs to be materialized in the output).
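One subtle difference worth noting (my observation, not something the plans above demonstrate): the original scalar subquery returns NULL when no BusinessEntity row matches, while an INNER JOIN would drop the Person row entirely. In AdventureWorks the foreign key guarantees a match, but if the relationship could ever be optional in your schema, a LEFT JOIN preserves the subquery's semantics:

;WITH pg AS
(
    SELECT BusinessEntityID
    FROM Person.Person
    ORDER BY LastName
    OFFSET 400 ROWS FETCH NEXT 50 ROWS ONLY
)
SELECT p.FirstName, p.MiddleName, p.LastName,
       be_modified = be.ModifiedDate
FROM Person.Person AS p
INNER JOIN pg
    ON p.BusinessEntityID = pg.BusinessEntityID
LEFT JOIN Person.BusinessEntity AS be -- LEFT JOIN keeps Person rows with no match
    ON p.BusinessEntityID = be.BusinessEntityID
ORDER BY p.LastName;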
Where you see the biggest impact is in I/O. Take a look at reads (yours above, mine below):
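If you want to reproduce these numbers yourself, SET STATISTICS IO reports logical reads per table; a minimal harness (mine, not from the original post):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- ... run the subquery version and the CTE/join version here, then compare
-- the "logical reads" reported for each table on the Messages tab ...

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;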
Now run again with OFFSET 4000 instead of 400:
When you get toward the end of the table, the index scan against Person incurs a higher I/O cost (more reads), but the query still finishes faster, due in part to lower CPU. This is at OFFSET 19500:
Note that your original query produces a slightly different plan toward the end of the table, because it crossed an optimizer tipping point along the way. (My version kept the same plan throughout.)
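Since the best plan can shift with the offset, this is also where the OPTION (RECOMPILE) comment above comes in: parameterize the paging values and let the optimizer see the actual numbers on every execution. A sketch (variable names are mine):

DECLARE @Offset   int = 400,   -- e.g. 400, 4000, 19500
        @PageSize int = 50;

;WITH pg AS
(
    SELECT BusinessEntityID
    FROM Person.Person
    ORDER BY LastName
    OFFSET @Offset ROWS FETCH NEXT @PageSize ROWS ONLY
)
SELECT p.FirstName, p.MiddleName, p.LastName,
       be_modified = be.ModifiedDate
FROM Person.Person AS p
INNER JOIN pg
    ON p.BusinessEntityID = pg.BusinessEntityID
INNER JOIN Person.BusinessEntity AS be
    ON p.BusinessEntityID = be.BusinessEntityID
ORDER BY p.LastName
OPTION (RECOMPILE); -- compile for the current @Offset / @PageSize values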
(Whether you should change strategy when you get toward the end of the table may depend on your own implementation, and tests against AdventureWorks might not be wholly representative.)
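One such alternative strategy is keyset paging: instead of counting past N rows every time, remember where the previous page ended and seek straight to it. This isn't what the plans above show, so treat it as a sketch; because LastName isn't unique, it needs a tiebreaker column:

DECLARE @LastLastName nvarchar(50) = N'Smith', -- illustrative values: the last row
        @LastID       int          = 12345;    -- of the previous page, remembered
                                               -- by the application

SELECT TOP (50)
       p.FirstName, p.MiddleName, p.LastName,
       be_modified = be.ModifiedDate
FROM Person.Person AS p
INNER JOIN Person.BusinessEntity AS be
    ON p.BusinessEntityID = be.BusinessEntityID
WHERE p.LastName > @LastLastName
   OR (p.LastName = @LastLastName AND p.BusinessEntityID > @LastID)
ORDER BY p.LastName, p.BusinessEntityID;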
This is shamelessly lifted from my work in this article: