There is a warning in the plan on the Hash Match operator, is this something i should worry about? Also the whole query takes 16-20 seconds and the biggest cost is in the Object6.Index4 Index Seek (which also does quite a lot of scans) Is there a way to reduce the cost of the operator? Would it help to reduce the cost?
By Anonymous User 16 asked Dec 21, 2015 at 01:09 PM
Sure. It is a recursive spill (level 2) and won't he helping.
You have to remember that cost percentages are always estimates, and are based on the optimizer's model for costing, which is highly unlikely to match your hardware.
Do not be misled into thinking that 45.8% of the runtime cost of this plan was consumed by this seek. SQL Server does not currently measure actual runtime performance per plan operator.
Even if the execution plan row count estimates and row sizes were exactly correct at runtime (which they most certainly are not in this case), it is highly unlikely the seek would consume 45.8% of the cost (resources/time) on your particular hardware.
The point of having a cost model is to provide the optimizer with a way to choose between the options it explores when deciding how to return the results logically specified by your query.
All that said, it is most important to understand why the optimizer chose the plan it did, and to assess whether its choices were reasonable given the runtime information we can see.
From this point of view, it is clear that the optimizer chose a plan that features non-blocking operators (e.g. the Hash Match Flow Distinct and Nested Loops Joins) because you asked for 10,000 rows in no particular order (there is no
Constructing a plan that aims to return 'x' rows quickly, rather than being optimized for the full potential result size, is known as setting a Row Goal. It is pretty common in this sort of plan that things can perform much less well than expected if the cardinality estimates are incorrect, or the distribution of data is such that the expected rows are not actually encountered as quickly as the optimizer assumed.
That is indeed the case here; almost every operator in the plan ends up doing significantly more work than was expected to find 10,000 rows that qualify. The root cause of this is just as likely to be down to where the sought values are positioned in the index (data distribution) as it is to unrepresentative statistics. This means you might well not be able to improve the plan simply by updating statistics or providing new ones.
As far data distribution is concerned, the optimizer assumes that data is evenly distributed, which is very often not a valid assumption to make. There is no general solution to this issue for all situations, but you can at least tell if this assumption about data distribution is causing your problems by recompiling and running the query with documented trace flag 4138 enabled.
For test purposes, you can enable this for the query by adding the query hint
There is no guarantee that the 4138-enabled plan will be better, but the chances are pretty good, in my experience. You can use 4138 in a production system, though setting a trace flag requires sysadmin privilege, which may be an issue for you. You could choose to implement the hint using a Plan Guide, or simply force the plan shape using a similar method or
By SQLkiwi ♦ 6.6k answered Jan 03, 2016 at 10:01 AM