Why does a table have a large amount of iterations

Jeff @jlangdon 2013-04-23 14:15:11

The extStaging table shows a clustered index scan with an Estimated Operator Cost of 3.3% There are 14k rows per iteration and the table is iterated 856 times for a total of 12 M reads.

I know there is plenty wrong with the is query, but just trying to focus on certain aspects. Any suggestions would be appreciated.

SQLkiwi 2013-04-23 15:16:50
To answer the question directly, the reason is that an earlier cardinality estimate is wrong. SQL Server expects to execute that whole branch just once, based on one row arriving at the driving nested loops outer join. The particular point where the error occurs is shown below:

Cardinality error origin

The Hash Match Inner Join is estimated to produce one row, whereas in fact it produces 13,767. This is reduced slightly to 13,619 by the time the stream reaches the Nested Loops Outer Join that drives the branch you refer to, and further reduced to 869 rows by a series of operations on the same branch immediately before the operation of interest:

Down to 869 rows

If it weren't for the reduction caused by the Left Semi Join, this particular issue would be worse because the scan would be executed 13,619 times not 869.

The query is so large that it is unlikely you will be able to resolve all the cardinality estimation problems, even trying to tackle one issue at a time. You might reduce the impact of this particular incorrect estimation by adding a suitable index to extStaging:

Looking at the predicate on the Nested Loops Inner Join at the far left of the plan extract above, the index needs to help with CaseID = [Expr1446]. The Compute Scalar on the inner side of the join defines [Expr1446] as CONVERT(int, CaseID), and the scan itself has a predicate RecordImportedOn_CAWI IS NOT NULL AND CaseID IS NOT NULL.

The extStaging table needs an index on CaseID (as an integer) to achieve a seek. The index should also include the RecordImportedOn_CAWI column, and the other columns listed in the Output List of the scan. That change would help with that particular issue, but you would need to review the other problematic extStaging accesses to determine the best indexes to add.

All that said, my preference would be to rewrite this query completely, in much smaller and simpler steps. No doubt there are reasons for its current complexity – perhaps you are forced to join complex views for some reason – but it would be remiss of me not to mention that the current query and execution plan is well beyond sensible or reasonable.

Dan Holmes 2013-04-23 14:27:01
Because the row counts coming into this operator vary from the estimate to the actuals. The estimate shows 1 row coming out of the nested loop operator. That means the CI Scan would only happen once and there would be ~13k rows. In this case the actual row count was 869 so you got that many table scans and ~12mil rows.

That is the why. The fix is probably farther to the right and fixing the SQL/Stats/Indexes such that the estimates are closer to the actuals.