I'm trying to understand why the table spool outputs 15 million rows when the table is only 2500 rows and why the query is slow

Tony Green 2016-04-19 15:23:51

The table row counts are in the Comments section

SQLkiwi 2016-04-21 12:07:31
The answer to the question in the title is in the properties of the table spool:

alt text

The spool replays 2,327 rows from its input for each of the 6,684 iterations of the nested loops join, giving 15,553,668 rows in total.

The query optimizer expected only 38 rows from the Sort. If that had been the case in practice, the sort and table spool strategy might have been ok. Now, you could avoid the spool with trace flag 8690 as described in the link, but I think it is more likely that there is an oversight in the query itself.

The logic is a little hard to follow without DDL (and you should always use table/subqueries aliases) but it appears that the MAX(Sequence) subquery should also be restricted to the given @patient_id.

You could do this by adding WHERE PatientId = @patient_id (with the proper table alias on PatientID) in the MAX subquery. If that is a valid interpretation of the requirement, you should try that and see how it affects the plan and performance.

I would be tempted to rewrite the outer WHERE clause (and adding aliases), so the query becomes:

SELECT
    VD.Value AS VALUE,
    CAST(NULL AS DATETIME) AS RESULT_TIME,
    0 AS SEQ_NUM
FROM dbo.VitalsData AS VD
JOIN dbo.GdsCodeMap AS GCM
    ON GCM.CodeId = @type 
    AND GCM.FeedTypeId = VD.FeedTypeId
    AND GCM.Name = VD.Name
JOIN dbo.TopicSessions AS TS
    ON TS.Id = VD.TopicSessionId
WHERE
    EXISTS
    (
        SELECT 1
        FROM dbo.PatientSessionsMap AS PSM
        WHERE
            PSM.PatientId = @patient_id
            -- Same session as outer query
            AND PSM.PatientSessionId = TS.PatientSessionId
            -- Highest sequence for the given patient and session
            AND PSM.Sequence =
            (
                SELECT 
                    MAX(PSM2.Sequence)
                FROM dbo.PatientSessionsMap AS PSM2
                WHERE
                    -- Same patient
                    PSM2.PatientId = @patient_id
                    -- Same PatientSessionId as the outermost query
                    AND PSM2.PatientSessionId = TS.PatientSessionId
            )
    );

For best results, there should be an index on dbo.PatientSessionsMap with the keys:

(PatientId, PatientSessionId, Sequence)
runamuk0@msn.com 2016-04-21 17:45:17
Hi,
I tried your suggestion and the performance improved by orders of magnitude. 🙂 I also found that removing the outer patient_id predicate that it got even faster.

I attached another PE file with the original query, updated per your suggestion and your rewrite. Row counts and DDL are in the PE file comments section.

link text

SQLkiwi 2016-04-21 17:53:26
That's good news.