Need help to optimize this plan

Anamikha 2018-04-11 06:45:24

1.There is a warning in the the Key Lookup having the highest operation cost %. It has a warning “Operations caused residual IO.The actual number of rows read was 342, but the number of rows returned was 180”. Need help/suggesstion/recommendations to fix this warning
2.Also, any other suggesstion to improve the performace of this query
SDyckes2 2018-04-11 18:48:25
The query you have uploaded runs very fast by most standards, the duration of this query was 151ms. Do you have a goal for the runtime? Sub 100ms? Or is this a subset of the data size and you are worried about how the query will scale?

Either way, the performance tuning is the same for this query. I always like to start by looking for the low-hanging fruit. I classify Key Lookups and Updating Statistics as low hanging fruit, or methods that do not require a large investment of time and effort.

Thank you for uploading both the Estimated and Actual plans for the query, this is very helpful in seeing so much additional data collected by Plan Explorer (PE).

Starting with the statistics, I see that some indexes have not had their stats updated since February and others not since the end of March. The tables and indexes seem to be very small, so updating the stats may have a minimal impact, but then we are talking about shaving milliseconds off a query. So updating the stats may provide a measurable performance increase.

Next, let's investigate the Key Lookups. There are 5 Key Lookups in this query, you specifically referenced the one with the highest cost, 45% of the IO + CPU estimated cost as shown on the Actual Plan, version 3. Let's take a look at this Key Lookup.

Key Lookup

How do we know what change to eliminate a key lookup? The Execution Plan will provide that information if you know where to look. In PE, hover over the Key Lookup and a tool-tip will appear with a lot of data related to that node of the plan. In this case look to the bottom in the Output List (see image below).

Tool Tip

The Output list contains the 2 columns the Key Lookup is providing to the query. In this case, 2 columns, *SM_PERSONNEL_MASTER_ID* and *SM_PERSONNEL_TYPE_ID*. A key lookup will sit next to the Index/Table Seek/Scan. So as we see from the image above, the non-clustered index, *SM_PERSONNEL_DTL.IX_PSNLDTL_ORG*, is used by the optimizer to gather most of the information requested by the query. But, not all the information is being provided, so the optimizer must also query the Primary Key to retrieve the missing data.

To fix this, we can add the 2 columns to the non-clustered index as INCLUDE columns. This improved index can now provide all the data to the query as requested, eliminating the need to access the Primary Key or eliminating the Key Lookup. If you look at the tool-tip image, you will see that there was a Residual IO warning due to the Key Lookup performing 342 actual reads and only returning 180 actual rows. If you look at the Table I/O tab of the tool, you will see that there were 2,563 Logical Reads, so eliminating this Key Lookup will be a 13% savings on the Logical Reads for the *SM_PERSONNEL_DATA* table. For this query, that will be a savings of a small number of milliseconds, but if we were talking millions of records, this could be multiple seconds or minutes.

This kind of information is available in the SSMS execution plan but is easier to spot in Plan Explorer (PE). The power of PE lies not only in the ease of reading the plans and spotting potential performance issues but also in the additional tools included.

If we highlight the non-clustered index as shown in the first image above, then choose the Index Analysis tab, we will have access to additional data that can help us further tune this index and query. If you have generated an Estimated or Actual Plan with PE, you will have access to the Index Analysis information. Looking at the image below, you can see we provide a list of all the indexes on the table, potentially all the columns in the table, which columns are used by the query, is it an output column, used as a sort, is it part of the predicate, and we display the last time the statistics were updated on a column and on an index (scroll to the bottom of the Index column). The hit percentage score of 26% is highlighted, the higher the score, the better fit the index is for the query.

Initial Index Analysis View

The magic of this tool is the ability to build a new theoretical index or theoretically modify an existing index. With our example, we will add the 2 columns to the index as INCLUDE columns. Our score goes up to a 48%.

Covering Index

But that is only 48%, can we get it higher? The tool provides the clues, several predicate columns that are not referenced by our index. With some analysis, I see I can add those columns as INCLUDE columns as well, and attain an 80% hit percentage score.

Fully Covering Index

With a little extra work, we now have an index that better covers the one Key Lookup in this query. If you investigate the other Key Lookups, you will find a couple of them also reference the *SM_PERSONNEL_DTL* table. Our new index just might help remove the other Key Lookups as well. We will not know until we test our new index. You can quickly script out the theoretical index we have created, scroll to the bottom of the index column and click the button, it will pop open a window with TSQL code to drop and add the new index.

Scritp Index

You now have the script needed to deploy this index to a Dev/Test environment to fully test the performance.