Unnecessary reads in query with OFFSET / FETCH NEXT and subselect

Thomas Franz 2016-02-11 10:04:59

If I run the query below on AdventureWorks with OFFSET 0 (= the first page), it executes the subselect 50 times (as expected).

But when I set the offset to e.g. 400, it executes the subselect 450 times, although only 50 rows are shown and the result of the subquery does not affect which rows, or how many, are returned.

Is there a way to avoid this (and do only 50 reads on BusinessEntity when paging to later pages)?

SELECT *
  FROM (SELECT p.FirstName, p.MiddleName, p.LastName,
               (SELECT be.ModifiedDate
                  FROM Person.BusinessEntity AS be
                 WHERE be.BusinessEntityID = p.BusinessEntityID
               ) be_modified
          FROM Person.Person AS p
       )  AS s
    ORDER BY s.LastName
        OFFSET 0 ROWS
 FETCH NEXT 50 ROWS ONLY
 ;
Aaron Bertrand 2016-02-19 21:07:05
Here is your plan in its current form:

[execution plan for the original query]

Notice that you have multiple operators that need to read all 450 rows – the 50 you need in your result, and the 400 that appear before them in alphabetical order on last name. Also notice that because you used the seek into BusinessEntity as a subquery, that clustered index seek was executed 450 times (one for each row, including the rows you end up discarding). This is because those rows are retrieved before any filtering due to OFFSET / FETCH.
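
To see those 450 executions for yourself, here is a sketch using nothing but stock session options: turn on I/O statistics, run the original query at OFFSET 400, and look at the logical reads reported against BusinessEntity (the actual execution plan also shows the seek's Number of Executions).

-- Sketch: measure reads for the original query using standard SET options.
-- STATISTICS IO reports logical reads per table; the actual plan shows
-- the Number of Executions on the BusinessEntity seek.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT *
  FROM (SELECT p.FirstName, p.MiddleName, p.LastName,
               (SELECT be.ModifiedDate
                  FROM Person.BusinessEntity AS be
                 WHERE be.BusinessEntityID = p.BusinessEntityID
               ) be_modified
          FROM Person.Person AS p
       )  AS s
    ORDER BY s.LastName
        OFFSET 400 ROWS
 FETCH NEXT 50 ROWS ONLY;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;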

I think you'll find this form has better performance:

;WITH pg AS
(
  -- page through the narrow key list only
  SELECT BusinessEntityID
    FROM Person.Person
    ORDER BY LastName
    OFFSET 400 ROWS
    FETCH NEXT 50 ROWS ONLY
)
SELECT p.FirstName, p.MiddleName, p.LastName, be_modified = be.ModifiedDate
  FROM Person.Person AS p
  INNER JOIN pg
    ON p.BusinessEntityID = pg.BusinessEntityID
  INNER JOIN Person.BusinessEntity AS be -- seek runs only for the 50 paged keys
    ON p.BusinessEntityID = be.BusinessEntityID
  ORDER BY p.LastName; -- maybe OPTION (RECOMPILE) here

This produces the following plan:

[execution plan for the rewritten query]

This looks a little more complex, but while you still have to read 450 rows, that work happens only in the skinny index scan that retrieves the keys from Person. The other operators only need to look at the 50 relevant rows that come back from the CTE. And by joining to BusinessEntity instead of running the subquery before the filter, that seek now only has to be executed 50 times (once for each row that actually needs to be materialized in the output).
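
In a real application you would presumably parameterize the paging. Here is a sketch of the same keys-first pattern with hypothetical @PageNumber / @PageSize parameters (OFFSET and FETCH accept variables and expressions):

-- Same keys-first pattern, parameterized.
-- @PageNumber and @PageSize are hypothetical parameters for illustration.
DECLARE @PageNumber int = 9,  -- 1-based page number
        @PageSize   int = 50;

;WITH pg AS
(
  SELECT BusinessEntityID
    FROM Person.Person
    ORDER BY LastName
    OFFSET (@PageNumber - 1) * @PageSize ROWS
    FETCH NEXT @PageSize ROWS ONLY
)
SELECT p.FirstName, p.MiddleName, p.LastName, be_modified = be.ModifiedDate
  FROM Person.Person AS p
  INNER JOIN pg
    ON p.BusinessEntityID = pg.BusinessEntityID
  INNER JOIN Person.BusinessEntity AS be
    ON p.BusinessEntityID = be.BusinessEntityID
  ORDER BY p.LastName
  OPTION (RECOMPILE); -- per the comment above; compiles a fresh plan for the actual page values

The OPTION (RECOMPILE) hint can help avoid reusing a plan that was compiled for a very different offset.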

Where you see the biggest impact is in I/O. Take a look at reads (yours above, mine below):

[reads comparison at OFFSET 400: original query above, rewrite below]

Now run again with OFFSET 4000 instead of 400:

[reads comparison at OFFSET 4000]

When you get toward the end of the table, the index scan against Person ends up having a higher I/O cost (more reads), but it still ends up with a faster duration, due in part to lower CPU. This is at OFFSET 19500:

[reads comparison at OFFSET 19500]

Note that your original query produces a slightly different plan toward the end of the table, because it hit some tipping point somewhere. (My version kept the same plan throughout.)

[the original query's plan near the end of the table]

(Whether you should change strategy when you get toward the end of the table may depend on your own implementation, and tests against AdventureWorks might not be wholly representative.)

This is shamelessly lifted from my work in this article:

Thomas Franz 2016-02-22 08:47:47
Thank you very much, I'll try to figure out if I can apply it to our app.

The main problem is that the result is used for a grid on a web page, and the customer can filter on nearly every column.

This may be no problem if he filters on a column that is part of the CTE, but if he applies a very selective filter to the main query, he will end up with e.g. 50 pages of only 2-5 rows per page (instead of 3 pages with 50 rows).
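
For illustration, one way to keep the page size consistent is to apply the customer's filter inside the CTE, so the paging only ever sees matching rows. A sketch with a hypothetical @LastNamePattern filter (any filterable column would work the same way):

-- Sketch: push the user's filter into the CTE so every page still yields
-- a full @PageSize rows. @LastNamePattern, @PageNumber, and @PageSize are
-- hypothetical parameters for illustration.
DECLARE @LastNamePattern nvarchar(50) = N'S%',
        @PageNumber      int = 1,
        @PageSize        int = 50;

;WITH pg AS
(
  SELECT BusinessEntityID
    FROM Person.Person
    WHERE LastName LIKE @LastNamePattern -- filter before paging
    ORDER BY LastName
    OFFSET (@PageNumber - 1) * @PageSize ROWS
    FETCH NEXT @PageSize ROWS ONLY
)
SELECT p.FirstName, p.MiddleName, p.LastName, be_modified = be.ModifiedDate
  FROM Person.Person AS p
  INNER JOIN pg
    ON p.BusinessEntityID = pg.BusinessEntityID
  INNER JOIN Person.BusinessEntity AS be
    ON p.BusinessEntityID = be.BusinessEntityID
  ORDER BY p.LastName
  OPTION (RECOMPILE);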