How to eliminate table spool
Example of a NLJ with residual predicate:
Example of a NLJ apply:
An NLJ apply will have Outer References, typically the joining key, and ideally there will be an Index Seek (not scan) on the inner side of the join using the outer reference.
You first need to review your indexing to ensure that an apply is possible. This means reviewing the joins with a residual predicate, to check that an index on the inner side table has the predicate column(s) in a suitable position in the index for seeking.
To take a simple example:
The tooltip for that join currently shows a residual predicate:
The predicate and output list on the Clustered Index Scan is:
So, to eliminate this table spool and replace the NLJ with an apply, you need an index that is seekable on QuizSetId and DataStateId (from the join predicate), and LanguageId (from the scan predicate), and which can return the LocalizedDisplayName column (from the output list). A suitable index would be:
CREATE INDEX i ON TEO.QuizSetsLoc (QuizSetId, DataStateId, LanguageId) INCLUDE (LocalizedDisplayName)
Obviously, you need to assess that index in terms of your overall workload. You may find it better to reorder the index keys, or you may find an existing index already meets these requirements, or an existing one can be easily modified.
If the optimizer still chooses an NLJ with residual for that join instead of the desired apply-with-seek, you may need to use a FORCESEEK hint. Note that it is possible the optimizer will still choose a table spool with an apply, but the chances are much lower.
That is the sort of analysis you need to do on each of those joins to ensure you have the correct indexes in place. For the more complex joins (e.g. the ones with joins and aggregates on the inner side) the process is basically the same, but you may need to adjust indexing (or use a hint) on more than one table.
On a more general note, your query is too large to expect much magic from the optimizer. There are too many joins, aggregates, and other things to expect accurate cardinality estimation, leading to incorrect plan shape choices. From the look of things, it should be easy to break this query down into simpler steps, using temporary tables to hold small intermediate results. Your payback for this effort will be better estimations, easy plan analysis, and better use of the optimizer's abilities.
The better information available to the optimizer should also help remove the table spools, assuming the number of expected apply duplicates is relatively low. Once the indexing is optimal, and the estimates are reasonable, you may need to plan guide to force a plan with no spools. As a last resort, you can use undocumented and unsupported trace flag 8690 to generate a plan with no spools of this type. This will allow you to capture a non-spool plan shape to use with a plan guide. You will not want to be using undocumented trace flags in a production system, unless guided to do so by Microsoft support services.
thanks for your explanation, I will follow your recommendation.