Could this query be made more efficient?
Also, the whole query takes 16–20 seconds, and the biggest cost is the Index Seek on Object6.Index4 (which also performs quite a lot of scans).
Is there a way to reduce the cost of that operator, and would doing so actually speed up the query?
Sure. It is a recursive spill (level 2) and won't be helping.
You have to remember that cost percentages are always estimates, and are based on the optimizer's model for costing, which is highly unlikely to match your hardware.
Do not be misled into thinking that 45.8% of the runtime cost of this plan was consumed by this seek. SQL Server does not currently measure actual runtime performance per plan operator.
Even if the execution plan row count estimates and row sizes were exactly correct at runtime (which they most certainly are not in this case), it is highly unlikely the seek would consume 45.8% of the cost (resources/time) on your particular hardware.
The point of having a cost model is to provide the optimizer with a way to choose between the options it explores when deciding how to return the results logically specified by your query.
All that said, it is most important to understand why the optimizer chose the plan it did, and to assess whether its choices were reasonable given the runtime information we can see.
From this point of view, it is clear that the optimizer chose a plan that features non-blocking operators (e.g. the Hash Match Flow Distinct and Nested Loops Joins) because you asked for 10,000 rows in no particular order (there is no ORDER BY clause).
Constructing a plan that aims to return 'x' rows quickly, rather than being optimized for the full potential result size, is known as setting a Row Goal. It is pretty common in this sort of plan that things can perform much less well than expected if the cardinality estimates are incorrect, or the distribution of data is such that the expected rows are not actually encountered as quickly as the optimizer assumed.
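The shape described arises whenever a query limits the number of rows without specifying an order. A minimal sketch of the pattern (table and column names here are hypothetical, standing in for the anonymized objects in your plan):

```sql
-- Hypothetical example: TOP without ORDER BY sets a row goal.
-- The optimizer costs this plan to find the *first* 10,000
-- qualifying rows quickly, not to process the full result set.
SELECT TOP (10000)
    o.SomeColumn
FROM dbo.Object6 AS o
JOIN dbo.Object3 AS j
    ON j.KeyCol = o.KeyCol
WHERE o.FilterCol = 1;  -- no ORDER BY, so any 10,000 qualifying rows will do
```

If the qualifying rows happen to sit late in the index order, the "first rows fast" strategy ends up doing far more work than the estimates suggested.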
That is indeed the case here; almost every operator in the plan ends up doing significantly more work than was expected to find 10,000 rows that qualify. The root cause of this is just as likely to be down to where the sought values are positioned in the index (data distribution) as it is to unrepresentative statistics. This means you might well not be able to improve the plan simply by updating statistics or providing new ones.
As far as data distribution is concerned, the optimizer assumes that data is evenly distributed, which is very often not a valid assumption to make. There is no general solution to this issue for all situations, but you can at least tell whether this assumption is causing your problems by recompiling and running the query with documented trace flag 4138 enabled.
For test purposes, you can enable this for the query by adding the query hint OPTION (QUERYTRACEON 4138). This will result in a plan without row goals – so it will be optimized for the full potential result set size. You will likely find the plan will no longer feature so many loop joins and almost certainly not a Flow Distinct.
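For example (the SELECT text below is a placeholder for your real query; RECOMPILE is added so the test uses a fresh plan):

```sql
-- Hypothetical test: compile and run with row goals disabled.
-- Note: QUERYTRACEON requires sysadmin membership at execution time.
SELECT TOP (10000)
    o.SomeColumn
FROM dbo.Object6 AS o
WHERE o.FilterCol = 1
OPTION (RECOMPILE, QUERYTRACEON 4138);
```

(As an aside, on SQL Server 2016 SP1 and later the documented equivalent is `OPTION (USE HINT ('DISABLE_OPTIMIZER_ROWGOAL'))`, which does not require sysadmin.)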
There is no guarantee that the 4138-enabled plan will be better, but the chances are pretty good, in my experience. You can use 4138 in a production system, though setting a trace flag requires sysadmin privilege, which may be an issue for you. You could choose to implement the hint using a Plan Guide instead, or force the desired plan shape with a USE PLAN hint.
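A plan guide attaching the hint might look like the sketch below. The statement text shown is a placeholder (the `...` is deliberate); for the guide to be applied, `@stmt` must match the query as submitted by the application exactly, character for character:

```sql
-- Hypothetical plan guide: attaches the trace flag hint to a
-- matching ad hoc statement without changing application code.
EXEC sys.sp_create_plan_guide
    @name            = N'PG_Object6_NoRowGoal',  -- any unique name
    @stmt            = N'SELECT TOP (10000) ... FROM dbo.Object6 ...',
    @type            = N'SQL',
    @module_or_batch = NULL,
    @params          = NULL,
    @hints           = N'OPTION (QUERYTRACEON 4138)';
```

Drop it later with `EXEC sys.sp_control_plan_guide N'DROP', N'PG_Object6_NoRowGoal';` if the forced behaviour stops being helpful.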