Query slow after adding additional where clause
We are running SQL Server 2005.
I have a query that runs in around 18 seconds. After adding an additional clause to the where statement it takes close to 2 minutes. The additional where clause is:
sc.ChargeAmount IS NOT NULL
The shape of the query plan also changes completely.
I've attached two query plans from SQL Sentry. One with and one without the extra where clause.
Edit:
Notice the thick lines in the above image. How do I get rid of them?
Any help would be appreciated.
Thanks
Alan
The whole query is the contents of a view that is used in multiple reports.
The view is built directly on base tables, apart from:
prism72ext.dbo.GetDivision (function)
prism72Ext.dbo.ServiceChargeDetail (view)
There are too many issues to fix properly in this setting, but the simple cause of the performance regression when adding the new filter is that the optimizer chooses a nested loops join on the basis of a too-low cardinality estimate, perhaps due to data skew or an ascending-key statistics problem.
The brutal 'fix' above is to disallow nested loops join, giving the optimizer only hash and merge to choose from. This was possible in this case because the optimizer was already choosing hash/merge for all but the problematic join.
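Concretely, that hint is applied at the statement level with an OPTION clause. The query below is only a sketch: the view name and columns are placeholders, not the actual objects from the question.

```sql
-- Sketch only: view and column names are assumed, not taken
-- from the question.
SELECT v.ServiceKey, v.ChargeAmount
FROM dbo.SomeReportView AS v
WHERE v.ChargeAmount IS NOT NULL
-- Restrict the optimizer to hash and merge joins for every join
-- in this statement; nested loops is no longer a candidate.
OPTION (HASH JOIN, MERGE JOIN);
```

Because the hint applies to every join in the statement, it is only safe here because the optimizer was already picking hash/merge everywhere except the one problematic join.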
Even so, this is very much a last resort, as I said. Database designs with complex nested views and non-inline functions can be problematic at best.
To help me in this type of situation in the future, could you advise me on what made you think of using these hints?
I agree that the query is a bit of a mess. It spans two databases, neither of which is well indexed (one is vendor-supplied and has few indexes, with some tables lacking even a clustered index; that database is also in 2000 compatibility mode). There is also a lot of nesting of views and functions. This is primarily for code consistency: we have a lot of views with reports built on them, and without nesting, maintaining the code would be difficult. Are there better ways of re-using code?
You say this is a brutal fix. Are there any ramifications to this? I ask as I may use this elsewhere.
- If sc.ChargeAmount is a computed column, consider making it persisted and indexing it. This allows you to pay the cost of computation at insert/update time, instead of at query time, and will allow you to eliminate the non-null values easier. I would suggest a filtered index, but it seems from the query plans that you are on 2005. Something to consider if you plan to move to a supported version of SQL Server.
- Add Description as an include column to the ServiceType index on ispServiceType (or add a clustered index to that table). Either of these should eliminate a costly RID lookup that appears in one version of the plan.
- Make sure that your stats on all relevant tables are up to date, and try to run both queries again with an explicit statement-level RECOMPILE.
- Your case expression should change to this, especially if those datetime columns are indexed (and you may consider calculating "now" into a variable ahead of time):
...
, Age = CASE
      WHEN dt < DATEADD(DAY, -120, GETDATE()) THEN '121+'
      WHEN dt < DATEADD(DAY,  -90, GETDATE()) THEN '091-120'
      WHEN dt < DATEADD(DAY,  -60, GETDATE()) THEN '061-090'
      WHEN dt < DATEADD(DAY,  -30, GETDATE()) THEN '031-060'
      WHEN dt < GETDATE()                     THEN '000-030'
      ELSE '???'
  END
, Description -- remove prefixes from these columns
...
FROM
(
    -- put the entire current FROM as a subquery, except the WHERE at the end
    SELECT *, -- list out actual columns you need perhaps
        dt = CASE s.[Status]
                 WHEN 'r' THEN s.CreateDateTime
                 ELSE wr.WRCreateDateTime
             END
    FROM prism72.dbo.ispService s
    ...
        ) owner ON wrs.WorkRequestId = owner.WorkRequestId
    ) wr ON s.TopLevelServiceKey = wr.ServiceKey
) AS x
WHERE
(
    (Status = 'R') -- remove prefixes here too
    OR (ServiceKey IS NOT NULL) -- and here
)
...
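For the first point, a persisted computed column plus an index might look like the sketch below. The table name and the computation are placeholders, since the real definition of ChargeAmount isn't shown in the question.

```sql
-- Sketch only: table name and expression are assumed.
ALTER TABLE dbo.ServiceCharge
    ADD ChargeAmountPersisted AS (Quantity * UnitPrice) PERSISTED;

-- Indexing the persisted column lets the optimizer seek on it
-- rather than computing the expression per row at query time.
CREATE NONCLUSTERED INDEX IX_ServiceCharge_ChargeAmountPersisted
    ON dbo.ServiceCharge (ChargeAmountPersisted);

-- On SQL Server 2008 and later, a filtered index targets the
-- IS NOT NULL predicate directly (not available on 2005):
-- CREATE NONCLUSTERED INDEX IX_ServiceCharge_NotNull
--     ON dbo.ServiceCharge (ChargeAmount)
--     WHERE ChargeAmount IS NOT NULL;
```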
- I am on SQL Server 2005. There is little I can do here.
- I've done this. I understand this is better, but it made no difference to the query run time.
- My stats are up to date; I use Ola Hallengren's script for this. I tried RECOMPILE, but it made no difference to the query run time.
- I can't refactor the CASE expression, as it depends on a nested CASE expression. However, I don't think this would speed up the query much. I'm not sure where you got the "SELECT *" from, as it does not appear in my query, or am I missing something?
I'm no expert, but I think even the combined actions above would not reduce the run time dramatically, which is what I need.
I've updated my question to include a screen shot for part of the query that seems to bring back many more rows than needed. The "Estimated Data Size" is 3MB whereas the "Actual Data Size" is 4,979 MB. Could this be the source of my problems?
Thanks.
Sorry I misread #4. I tried your refactoring, unfortunately it didn't give a performance improvement. In fact the query plan looked just about identical to the original one.
I've since added some indexes to tables without a clustered index. Also to a couple of tables with high I/O. Before I did any tuning the query took 4m28s. It's now down to 1m32s.
I can see that pb_WorkRequestLog is still using a lot of I/O (2 million Logical Reads) and I'm struggling to get this down. Any ideas?
Thanks
I also note that pb_WorkRequestLog is used twice in that part of the plan. This is visible in the screenshot (far right) of the plan. There is still a big difference between estimated and actual rows, but nowhere near the other one.
I've attached the indexes for pb_WorkRequestLog to my question.
I've been rebuilding stats with a fullscan using Ola Hallengren's script, run as follows:
EXECUTE [dbo].[IndexOptimize] @Databases = 'prism72,prism72ext',
@LogToTable = 'Y',
@FragmentationLow = NULL,
@FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
@FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
@FragmentationLevel1 = 5,
@FragmentationLevel2 = 30,
@UpdateStatistics = 'ALL',
@StatisticsSample = 100
Looking closer at the query logic, there are a couple of things you might try, but without your actual table structure, sample data, and expected results, I am not 100% confident the results will be what you want.
The added where clause against ServiceChargeDetail turns the left join on that table into an inner join. So you can consider trying to push that filter into the join in one of two ways:
(1) simply moving it to the ON clause of the LEFT OUTER JOIN
(2) do (1) but change that LEFT OUTER JOIN to an INNER JOIN (I believe this will more accurately produce the results you are after, or at least the same results that you are getting now with the bad plan; whether this should return 1,521 rows or 4,282 rows or some other number of rows, I have absolutely no idea)
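In sketch form, the two options are shown below. The join key is assumed, since the full query text isn't reproduced here.

```sql
-- Option (1): keep the outer join, but filter inside ON.
-- Note: with a LEFT JOIN this changes the semantics; unmatched
-- rows now come back with NULLs instead of being removed by WHERE.
LEFT OUTER JOIN prism72Ext.dbo.ServiceChargeDetail AS sc
    ON  sc.ServiceKey = s.ServiceKey         -- join key assumed
    AND sc.ChargeAmount IS NOT NULL

-- Option (2): the original WHERE filter rejects the NULLs the
-- outer join produces anyway, so state the inner join explicitly.
INNER JOIN prism72Ext.dbo.ServiceChargeDetail AS sc
    ON  sc.ServiceKey = s.ServiceKey         -- join key assumed
    AND sc.ChargeAmount IS NOT NULL
```

Making the join an explicit INNER JOIN also gives the optimizer more freedom to reorder joins, which is often where the better plan comes from.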
Again, I am not convinced this will solve the problem, but it may be enough to get SQL Server to think about the plan a little harder, and come up with something better.
Thanks for trying to help.