The query below is generated by Business Objects. It runs for over 20 minutes before I cancel it. without the EXCEPT operator, each query runs in about 6 seconds. Is there a reason why it runs so slow with the EXCEPT operator?
By mylifeandsql 1 asked Jun 26 at 12:55 PM
If I isolate the queries, the right query returns no record. I am unable to attach the plan, but here is the link to download it: https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwetransfer.com%2Fdownloads%2Fd01be9244741740c0a67de8d3ee95c5320180626200009%2F4e7f9642f2a57210d729b73cd39e63e420180626200009%2F1347e4&data=02%7C01%7Cmike.miller%40ellucian.com%7Cb155985e8d54461ee12a08d5db9f722d%7Cba4f1b25f4f74403892553e24140459f%7C0%7C0%7C636656400219227583&sdata=0KthXOYtoThtp4c1DzL2czTkZ1DHEV3Odqw7B%2BbpVfE%3D&reserved=0
By mylifeandsql 1 answered Jun 26 at 08:02 PM
So here are a few things I see in the execution plan and that are probably NOT the cause of your pain:
My biggest concern in the excution plan is the top-right operator in both of the branches. It is a Clustered Index Scan on table STUDENT_ACAD_LEVELS. This table is not in your query so I assume that you are working with views. The table holds almost 120,000 rows so using a Clustered Index Scan to retrieve a subset of the rows is already non-optimal. The predicate is on a start date (<=getdate), end date (>getdate OR null), and on a weird substring function that (if I see it correctly) takes a column STUDENT_ACAD_LEVELS_ID and strips it off everything before a '*' character; the result then has start with GR or UG and have non-zero length (which is redundant give the first test).
What mostly bothers is not that no seek is used (which given the predicate would have been impossible), but the terrible cardinality estimation. Estimated rows is 235; actual is 5489 - almost 25 times as many. And given the location of this operator in the plan, the estimate of 235 has driven a lot of decisions. May operators are repeated once per row from this table, which is okay (ish) for 235 rows but can become a major issue with 5489 rows.
Are your statistics up to date? If no, then update them and see if you get better estimations and a better execution plan. Also, the predicate on "the part of a column that follows the * character" is a strong indication that your database is not properly normalized; that part of the column should almost certainly have been a separate column. (Which would have been much easier to index, and to get good cardiniality estimations for).
Another major issue is the STPR_DATES table. This table is accessed four times in total (twice in the upper part, twice in the lower part). All of the times its clustered index is scanned. A seek would be better - and in this case easy to obtain for the predicate of start date < getdate and end date > getdate or NULL by having a covering index on start_date. SQL Server estimates that 7277 rows will be returned per execution; in reality it's just 6357. Now you may think that scanning 205K rows to read 6537 rows may be a minor issue and that would be right if it happens once. But because of the issue above, each of the four copies of this operator runs 5486 times. And reading 200K rows (4 * 5486) is actually a lot of work.
There are also several index spools and table spools in the execution plan. These are sometimes needed for getting correct results, sometimes used to prevent even worse performance, but they can be true performance killers. It's hard to tell if they are actually the main issue here, and even harder to tell you how to fix this without knowing more about your query, view definitions, and data. If you run this query with SET STATISTICS IO ON, then I expect that you will see one or more lines marked Worktable with a high number of logical reads.
I have also looked over the text of your query in amazement. The second part is (unless I miss something) almost exactly the same as the first part, except that in the second part one extra predicate (L27RTP_STUDENTS.STU_ACTIVE_HIATUS_CODE = 'W') is added. Again, knowledge of data is needed to determine wether this is actually exactly the same, but I think that it would be easier to add L27RTP_STUDENTS.STU_ACTIVE_HIATUS_CODE <> 'W' to the first query (plus null handling if needed) and remove the entire except construction.
If you are sure that you really need the EXCEPT condition, then at least rewrite it as NOT EXISTS. The benefit is that now you only need to return the key values instead of all the columns, so you can simplify the second query (e.g. that FOR XML construction is probably not needed a second time at all). This will also greatly reduce the amount of work needed for the optimizer.
These are just a few pointers. As mentioned before, I do not know how your views are defined, what tables you have, how your data is distributed, what constraints are in place, etc.
All of the above does not answer your original question: how it is possible that each of the two queries runs in 6 seconds but the combination takes over 20 minutes. I could not see that from the execution plan. The most liekly explanation is that without the EXCEPT the optimizer somehow comes up with a completely different plan, that works much better - but I have no idea why. (Nor even whether it is true at all since I do not know how the execution plans for the individual queries look).
By Hugo Kornelis 271 answered Jun 27 at 12:54 PM