Window function vs group by method: find value for MAX ID

Yorik 2015-11-23 14:10:05

I'm trying to understand the performance issue with a window function when I select the value for the MAX ID:

--Old group By method
SELECT  
    tj.MainProcedure ,
    tj.StartTime ,
    tj.EndTime ,
    tj.Threshold,
    tj.LastStatus ,
    tj.rc,
    eh.ErrorMessage
FROM dbo.tbl_TraceJobExecutions tj WITH (NOLOCK)
INNER JOIN 
    (
        SELECT b.MainProcedure,
               b.ErrorMessage
        FROM
            (
            SELECT MainProcedure, MAX(RowID) AS MaxRowID
            FROM tbl_TraceJobExecutions_ErrorHandling WITH (NOLOCK)
            GROUP BY MainProcedure
            ) a
        INNER JOIN tbl_TraceJobExecutions_ErrorHandling b WITH (NOLOCK)
        ON b.MainProcedure = a.MainProcedure
        AND b.RowID = a.MaxRowID
        ) eh
ON eh.MainProcedure = tj.MainProcedure
 
ORDER BY tj.MainProcedure, tj.EndTime DESC
 
--VS Window Function
SELECT tbl_TraceJobExecutions.*, MaxVal.ErrorMessage
FROM tbl_TraceJobExecutions
INNER JOIN (
    SELECT 
        MAX(tbl_TraceJobExecutions_ErrorHandling.RowID) OVER (PARTITION BY tbl_TraceJobExecutions_ErrorHandling.MainProcedure) AS MaxID,
        tbl_TraceJobExecutions_ErrorHandling.MainProcedure,
        RowID,
        ErrorMessage
    FROM tbl_TraceJobExecutions_ErrorHandling
) AS MaxVal
    ON MaxVal.MainProcedure = tbl_TraceJobExecutions.MainProcedure
    AND MaxVal.RowID = MaxVal.MaxID
ORDER BY tbl_TraceJobExecutions.MainProcedure, tbl_TraceJobExecutions.EndTime DESC

The IO statistics:

(10 row(s) affected)
Table 'tbl_TraceJobExecutions'. Scan count 0, logical reads 46, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tbl_TraceJobExecutions_ErrorHandling'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

(10 row(s) affected)
Table 'Worktable'. Scan count 3, logical reads 1776, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tbl_TraceJobExecutions_ErrorHandling'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tbl_TraceJobExecutions'. Scan count 1, logical reads 79, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

What is wrong with my code? Thank you

Aaron Bertrand 2015-11-23 16:12:14
Can you generate the plans from within Plan Explorer? This way we'll have much more of the runtime data available (right now we can only rely on your textual output of stats i/o to see anything other than estimated costs).
Yorik 2015-11-24 07:57:57
Hi Aaron,

The actual plan is attached to the post.

Thank you:)

Aaron Bertrand 2015-11-24 16:57:06
@Yorik, the plans currently attached to the question are actual plans, but they were captured in Management Studio and then opened in Plan Explorer (you can tell because they are missing key metrics in the Statement grid, like Duration and CPU). What I meant by generating them from within Plan Explorer is to take the two queries, paste them into the Command Text tab, and then click the Get Actual Plan button.
SQLkiwi 2015-11-30 11:18:52
The first query form makes it easier for the query optimizer to understand that you are looking for one row per group. It recognises the pattern and uses a Segment and Top combination to split the incoming rows into groups and take the top (1) in the desired order. You can read more about this operation here: http://sqlblog.com/blogs/paul_white/archive/2010/07/28/the-segment-top-query-optimisation.aspx
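A closely related form that states the one-row-per-group intent even more directly is a correlated TOP (1) via CROSS APPLY. This is only a sketch based on the table and column names in the question, and it assumes an index such as (MainProcedure, RowID) INCLUDE (ErrorMessage) exists on the error-handling table for it to produce a seek-per-group plan:

```sql
-- Sketch: latest error message per MainProcedure via CROSS APPLY ... TOP (1).
-- Assumes the question's schema; index assumption noted in the lead-in.
SELECT
    tj.MainProcedure,
    tj.StartTime,
    tj.EndTime,
    eh.ErrorMessage
FROM dbo.tbl_TraceJobExecutions AS tj
CROSS APPLY
(
    SELECT TOP (1) e.ErrorMessage
    FROM dbo.tbl_TraceJobExecutions_ErrorHandling AS e
    WHERE e.MainProcedure = tj.MainProcedure
    ORDER BY e.RowID DESC      -- highest RowID = most recent error
) AS eh
ORDER BY tj.MainProcedure, tj.EndTime DESC;
```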

The second query may express the same logical query specification (though I didn't examine it in detail), but the use of a windowing function means the optimizer no longer applies the 'Segment Top' optimization. The resulting execution plan is a relatively unsophisticated translation of the original query: the table spool area of the plan finds the MAX per group and then applies it to every row, taking the window specification literally. The Filter before the join then reduces the set to the one row per group you're actually after.

In summary, the first query form is easier for the optimizer to work with, and benefits from an optimization that the second one does not benefit from.
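If you do want to keep a windowed form, ROW_NUMBER() often fares better than MAX() OVER here, because the rn = 1 predicate lets the optimizer discard non-qualifying rows earlier rather than spooling the whole window. Again, this is only a sketch against the question's tables, not a tested rewrite:

```sql
-- Sketch: windowed alternative using ROW_NUMBER() instead of MAX() OVER.
SELECT tj.*, eh.ErrorMessage
FROM dbo.tbl_TraceJobExecutions AS tj
INNER JOIN
(
    SELECT
        e.MainProcedure,
        e.ErrorMessage,
        ROW_NUMBER() OVER (PARTITION BY e.MainProcedure
                           ORDER BY e.RowID DESC) AS rn
    FROM dbo.tbl_TraceJobExecutions_ErrorHandling AS e
) AS eh
    ON eh.MainProcedure = tj.MainProcedure
WHERE eh.rn = 1                -- keep only the row with the highest RowID per group
ORDER BY tj.MainProcedure, tj.EndTime DESC;
```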

The other issue is to understand what you mean by "performance issue". While it is likely the first query will execute more quickly, the second one is not as bad as might be suggested by simply comparing the "logical reads". The first query will count (at least) one logical read per seek and lookup, but the scan will report one logical read per page of rows read from the scan. This is just the normal way logical reads are counted, but it can mislead people.

The second query counts logical reads the same way for the scans (per page) and lookup (at least one, probably more depending on the depth of the b-tree) but it also counts a logical read for every row read from the table spool. In a sense, spools look unfairly expensive when compared using logical reads alone. The details are esoteric, but essentially you can think of spools as reporting one or more logical reads per row read.

To conclude, I would prefer the first query form because it is simpler and makes better use of the optimizer's abilities. To compare real-world performance, I would run the two queries from Plan Explorer (as Aaron suggested), not SSMS. Logical reads can be a misleading metric, so I usually focus instead on whatever is actually important in context. Often, this will be elapsed time, CPU usage, or physical reads. There's really not enough information in the question to say anything definitive.
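If you do stay in SSMS, you can at least capture time alongside I/O so the comparison is not based on logical reads alone. A minimal harness looks like this (run each query form in the marked slot):

```sql
-- Capture both I/O and elapsed/CPU time statistics in the Messages tab.
SET STATISTICS IO, TIME ON;

-- ... run query form 1 here, then query form 2 ...

SET STATISTICS IO, TIME OFF;
```

Compare the "CPU time" and "elapsed time" lines as well as the reads; the spool-heavy plan will often look worse on reads than it really is on time.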