Appropriate statistics or indexes for this plan?
The tables involved already have the following nonclustered indexes:

- dbo.tCreditOffersNomenclatureValues (NomenclatureParentID, CreditOfferID, NomenclatureID) INCLUDE (ID, Value)
- dbo.tCreditOffersNomenclatureValues (CreditOfferID, NomenclatureID)
- dbo.tCreditOffersNomenclature (NomenclatureParentID, CreditOfferID, NomenclatureID) INCLUDE (ID, ContactID)
- dbo.tCreditOffersNomenclature (CreditOfferID, NomenclatureID)
If I create the following indexes, the execution plan changes completely and the "Nested Loops" and "Lazy Spool" steps disappear, but I'm wondering whether it's possible to optimize this plan with statistics alone, without introducing new indexes with permuted columns:

- dbo.tCreditOffersNomenclatureValues (CreditOfferID, NomenclatureID, NomenclatureParentID) INCLUDE (Value)
- dbo.tCreditOffersNomenclature (NomenclatureParentID, CreditOfferID, CDate DESC) INCLUDE (NomenclatureID, ContactID)
Any ideas for appropriate statistics or other optimization tricks?
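For reference, manually created multi-column statistics take the general form below. The column combination shown is taken from the existing index keys above; the statistics name is hypothetical, and whether this particular combination would actually change the plan is exactly what is in question:

```sql
-- Hypothetical multi-column statistics object on the existing key columns,
-- built with a full scan so the histogram is as accurate as possible.
CREATE STATISTICS stCONV_Parent_Offer_Nomenclature
ON dbo.tCreditOffersNomenclatureValues
    (NomenclatureParentID, CreditOfferID, NomenclatureID)
WITH FULLSCAN;
```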
This is also just an estimated plan, so I can't get a feel for how close any of these estimates are to reality. Are 128 trillion rows really possible anywhere? Are your join conditions sensible, or is it feasible that you have some unintended Cartesian products involved?
You definitely want to get rid of the two table spools (one is estimated to be executed 16 billion times!) and the two key lookups. But with the information I have available to me right now, I can't begin to suggest how to do that.
As Aaron noted, it's very hard to say on the basis of an estimated, anonymized plan alone, particularly where the plan is on the large side.
That said, from the general shape of things, it seems most of the effort is centred around finding the highest (or lowest) row from a grouping. There are two repeating patterns:
The above fragment is associated with numbering rows and then filtering for a particular row number (typically 1). This is convenient to write, but the Segment and Sequence Project operators force a temporary end to parallelism, and every row must be examined and numbered, not just the 'first' one we are interested in.
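In T-SQL, that pattern typically looks like the sketch below. The table and column names are borrowed from the indexes listed in the question, but the actual query is anonymized, so treat this purely as an illustration of the shape:

```sql
-- Illustrative only: number all rows per group, then keep row 1.
-- The Segment and Sequence Project operators in the plan are produced
-- by the ROW_NUMBER window function.
WITH Numbered AS
(
    SELECT
        v.CreditOfferID,
        v.Value,
        rn = ROW_NUMBER() OVER (
            PARTITION BY v.CreditOfferID
            ORDER BY v.ID DESC)
    FROM dbo.tCreditOffersNomenclatureValues AS v
)
SELECT CreditOfferID, Value
FROM Numbered
WHERE rn = 1;
```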
This second pattern hints at a similar task: joining to an aggregate derived from another table. The optimizer's decision to spool these subtrees is a side-effect of cost estimates that are highly unlikely to be accurate.
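A sketch of that second shape, again with assumed names since the real query is not visible:

```sql
-- Illustrative only: each outer row joins to a MAX aggregate computed
-- over another table. The optimizer may spool the aggregated subtree
-- and replay it for repeated executions.
SELECT n.CreditOfferID, n.ContactID, agg.MaxID
FROM dbo.tCreditOffersNomenclature AS n
JOIN
(
    SELECT v.CreditOfferID, MaxID = MAX(v.ID)
    FROM dbo.tCreditOffersNomenclatureValues AS v
    GROUP BY v.CreditOfferID
) AS agg
    ON agg.CreditOfferID = n.CreditOfferID;
```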
Rather than asking for very specific advice about the indexes and spools, I think you should be looking at ways to reduce the amount of work involved in locating the min/max rows. That would probably be a new question, though, and would require more details.
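One common way to locate just the extreme row per group is APPLY with TOP (1), sketched here with assumed names. With a supporting index ordered on the grouping and ordering columns, this can seek directly to the single row of interest instead of numbering and discarding all the others:

```sql
-- Illustrative rewrite: for each offer, seek the one latest row
-- rather than numbering every row in the group.
SELECT o.CreditOfferID, x.Value
FROM
(
    SELECT DISTINCT CreditOfferID
    FROM dbo.tCreditOffersNomenclatureValues
) AS o
CROSS APPLY
(
    SELECT TOP (1) v.Value
    FROM dbo.tCreditOffersNomenclatureValues AS v
    WHERE v.CreditOfferID = o.CreditOfferID
    ORDER BY v.ID DESC
) AS x;
```

Whether this helps in practice depends on the cardinalities and available indexes, which is why it would make a better follow-up question with full details.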