I'm trying to understand why the table spool outputs 15 million rows when the table is only 2500 rows and why the query is slow

The table row counts are in the Comments section

avatar image By Tony Green 37 asked Apr 19, 2016 at 03:23 PM
more ▼
(comments are locked)
10|10000 characters needed characters left

1 answer: sort oldest

The answer to the question in the title is in the properties of the table spool:

alt text

The spool replays 2,327 rows from its input for each of the 6,684 iterations of the nested loops join, giving 15,553,668 rows in total.

The query optimizer expected only 38 rows from the Sort. If that had been the case in practice, the sort and table spool strategy might have been ok. Now, you could avoid the spool with trace flag 8690 as described in the link, but I think it is more likely that there is an oversight in the query itself.

The logic is a little hard to follow without DDL (and you should always use table/subqueries aliases) but it appears that the MAX(Sequence) subquery should also be restricted to the given @patient_id.

You could do this by adding WHERE PatientId = @patient_id (with the proper table alias on PatientID) in the MAX subquery. If that is a valid interpretation of the requirement, you should try that and see how it affects the plan and performance.

I would be tempted to rewrite the outer WHERE clause (and adding aliases), so the query becomes:

 SELECT
     VD.Value AS VALUE,
     CAST(NULL AS DATETIME) AS RESULT_TIME,
     0 AS SEQ_NUM
 FROM dbo.VitalsData AS VD
 JOIN dbo.GdsCodeMap AS GCM
     ON GCM.CodeId = @type 
     AND GCM.FeedTypeId = VD.FeedTypeId
     AND GCM.Name = VD.Name
 JOIN dbo.TopicSessions AS TS
     ON TS.Id = VD.TopicSessionId
 WHERE
     EXISTS
     (
         SELECT 1
         FROM dbo.PatientSessionsMap AS PSM
         WHERE
             PSM.PatientId = @patient_id
             -- Same session as outer query
             AND PSM.PatientSessionId = TS.PatientSessionId
             -- Highest sequence for the given patient and session
             AND PSM.Sequence =
             (
                 SELECT 
                     MAX(PSM2.Sequence)
                 FROM dbo.PatientSessionsMap AS PSM2
                 WHERE
                     -- Same patient
                     PSM2.PatientId = @patient_id
                     -- Same PatientSessionId as the outermost query
                     AND PSM2.PatientSessionId = TS.PatientSessionId
             )
     );

For best results, there should be an index on dbo.PatientSessionsMap with the keys:

 (PatientId, PatientSessionId, Sequence)

sp.png (31.9 kB)
avatar image By SQLkiwi ♦ 6.6k answered Apr 21, 2016 at 12:07 PM
more ▼
(comments are locked)
avatar image runamuk0@msn.com Apr 21, 2016 at 05:45 PM

Hi, I tried your suggestion and the performance improved by orders of magnitude. :-) I also found that removing the outer patient_id predicate that it got even faster.

I attached another PE file with the original query, updated per your suggestion and your rewrite. Row counts and DDL are in the PE file comments section.

link text

avatar image SQLkiwi ♦ Apr 21, 2016 at 05:53 PM

That's good news.

10|10000 characters needed characters left
Your answer
toggle preview:

Up to 50 attachments (including images) can be used with a maximum of 209.7 MB each and 209.7 MB total.

Follow this question

Topics:

x641
x166
x5
x1

asked: Apr 19, 2016 at 03:23 PM

Seen: 76 times

Last Updated: Apr 21, 2016 at 05:53 PM