Can I increase the performance of this query?

Yoten 2017-02-28 20:22:50

Plan Explorer here

I've got a (SQL Server) stored procedure that is used to flatten out a bunch of audit trail-related data for some reporting. The data is passed back to a C# program, and the database model was created using Entity Framework code-first which by default uses clustered primary key indexes and non-clustered indexes for all foreign key relationships.

The main table, Object14, contains the audit records and has a little under 10 million rows. The next-largest table, Object17, has the parent objects (i.e. the data itself that is actually being audited) and is about 2.5 million rows, while the rest of the tables are much, much smaller. Running the query in SSMS takes around 14-18 minutes and outputs about 6 million rows. I was wondering if it could be sped up in some way. The two big expenses I see are key lookups against the "Object21" table's indexes.

The top-left one has "Column34" in its output list. We actually already have an index on that column, since it's a foreign key to the "Object23" table you see one join to the left, but for some reason that index does not appear to be used.

The other key lookup on the lower-right is on a date field (Column19) and has no index. It's being caused by the last part of the WHERE clause, where we have:

WHERE 
  ...
  AND cda.[Date] = (
      SELECT MAX([Date])
      FROM CrfDataAudits cda2
      WHERE cda2.CrfData_Id = cd.Id
        AND cda2.[Date] <= DATEADD(ss, 30, fda.[Date])
  )

…which attempts to match the audit record's date with a grandparent's date, allowing a 30-second window because the two records are saved near each other in time, but not at exactly the same moment.
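Since the lookup on Column19 suggests no index covers that subquery, one option worth trying is a composite index keyed to match the correlation and the date predicate. This is only a sketch based on the snippet above (the index name is made up, and your real schema may differ):

```sql
-- Hypothetical covering index for the MAX([Date]) subquery.
-- Keyed on the correlation column first, then the date, so the MAX
-- within the 30-second window can be answered by a range seek
-- instead of a scan plus key lookup.
CREATE NONCLUSTERED INDEX IX_CrfDataAudits_CrfDataId_Date
    ON dbo.CrfDataAudits (CrfData_Id, [Date]);
```

Whether the optimizer actually picks it depends on the rest of the plan, but it at least gives the subquery a seekable access path.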

Does the current execution time (14-18 minutes) seem reasonable for this query, or is there some way to speed it up? Can the relatively expensive key lookups be replaced with something else?

Aaron Bertrand 2017-03-07 02:05:09
You can always eliminate key lookups by adding the missing columns to the key or INCLUDE list of the chosen index. Of course, making the index wider is not always an easy decision: not only does it impact the write side of your workload, it also reduces the index's benefit on the read side (a narrow index plus a key lookup is often chosen precisely because it is much more efficient than a wider non-clustered index or the clustered index).

In this case, though, the key lookup is likely chosen because the estimate is 1 row, while your actuals end up in the millions. It is not clear to me on first glance exactly why the estimates are so bad – it might be more obvious if you can provide a non-anonymized, actual execution plan generated from within Plan Explorer. This will ensure all runtime metrics are included and stats/index information is collected.

In the end, though, this query returns millions of rows – if you are consuming those in a Management Studio grid or in a client-side application, I would have to guess that is going to cause the bulk of any perceived delays.
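For example, a key lookup can usually be eliminated by re-creating the chosen index with the lookup's output columns in the INCLUDE list. The names below are placeholders taken from the anonymized plan, so adapt them to the real objects (the key lookup operator's Output List shows exactly which columns are missing):

```sql
-- Placeholder names from the anonymized plan; substitute the real
-- index, table, and columns. DROP_EXISTING rebuilds the index in
-- place under its existing name.
CREATE NONCLUSTERED INDEX IX_Object21_Column34
    ON dbo.Object21 (Column34)
    INCLUDE (Column19)
    WITH (DROP_EXISTING = ON);
```

Once the index covers every column the query needs from that table, the key lookup (and its per-row cost) disappears from the plan.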