Query running slow

PavanA 2015-05-14 09:54:26

Kin 2015-05-14 14:22:42
The actual vs estimated row estimates are way off. Do you have stats updated ? [cdm].[dbo].[pe_to_imports] and [cdm].[dbo].[pe_imports]. Also there is a keylookup, so if you include checksum column as part of index, it will get rid of it.
SQLkiwi 2015-05-23 09:05:36
From an execution plan point of view, the problem is the query optimizer expects to find the required 2000 matches quite quickly, so it chooses a strategy based on nested loops joins, particularly the Left Anti Semi Join. Using Plan Explorer to show the estimated row counts, we can see how the optimizer expected things to work:

Estimated plan

If the optimizer's estimates had been accurate, the results would have been returned very quickly. However, switching to actual rows mode shows that things did not go as expected:

Actual plan

The query does not produce any rows, let alone the 2000 matches the optimizer expected to find relatively quickly. So, what should have been a quick search using nested loops became an exhaustive search for all rows, with the lowest seek (on pe_imports) being executed 165,604 times.

Your query contains a large number of predicates (filtering conditions) that make it hard for the optimizer to assess selectivity accurately. You are also using an old version of SQL Server (2005) which means the normal solution for this type of problem is not available. My suggestions are:

  1. Remove the 'magic go-faster' nolock hints
  2. Ensure all columns in the query are properly qualified with an alias
  3. Avoid using difficult to estimate bit flag columns
  4. Upgrade to a supported version of SQL Server
  5. Break the NOT IN subquery out to a separate query
  6. Save the subquery results in a temporary table, indexed appropriately
  7. Only use TOP in a query if you really need to limit the results
  8. Only use LOB types (e.g. the checksum column) if the value really could exceed 8000 bytes
  9. If you need to keep the TOP, try rewriting the query to fetch a maximum of 2000 unique key values only, store these in a temporary table, then fetch the remaining non-key columns in a separate query.
  10. Avoid mixing many AND and OR conditions in a single query. Breaking the operation into sensibly-sized steps can help the optimizer.
  11. As a last resort, consider using query or join hints to force a hash or merge strategy if you know nested loops will not find the matches quickly.