Many joins with the same table – how to optimize it? Is Index_Seek the correct way here?

Patryk 2014-01-09 14:22:32

Here is the definition of the view referenced in the query (V_Person_Ext):

SELECT
 person.[PERS_ID], person.[AdressNrAbacus], person.[Name], person.[NameZusatz], person.[Vorname], person.[Anrede_KAWE_ID], person.[PersonTyp_KAWE_ID], person.[DebitorenNr], person.[DebitorenNR_Tech], person.[geeignetAuslesung], person.[MwStNr], person.[Sprache], person.[Bemerkung], person.[Kuerzel], person.[PersonalNr], person.[LastAbacusFeed], person.[DRCK_ID], person.[erfasstVon], person.[erfasstAm], person.[geaendertVon], person.[geaendertAm]
      ,kTel.KontaktInformation as GeschaeftTel
      ,kTel2.KontaktInformation as PrivateTel
      ,kUrl.KontaktInformation as Url
      ,kFax.KontaktInformation as Fax
      ,kEmail.KontaktInformation as Email
      ,kMobile.KontaktInformation as Mobile
      ,kata.Code as Anrede
  FROM dbo.Person_T person
     LEFT OUTER JOIN dbo.S_KatalogWert kata ON kata.KAWE_Id = person.Anrede_KAWE_ID
     OUTER APPLY (SELECT TOP 1 KontaktInformation  
                    FROM dbo.Kontakt_T 
                   WHERE dbo.Kontakt_T.PERS_ID = person.PERS_ID AND KontaktTyp_KAWE_ID = 'C70623F2-DEE0-4254-B2D6-84E52D8F0E47' /* tel */
                  ) kTel 
     OUTER APPLY (SELECT TOP 1 KontaktInformation  
                    FROM dbo.Kontakt_T 
                   WHERE dbo.Kontakt_T.PERS_ID = person.PERS_ID AND KontaktTyp_KAWE_ID = '99CDAEEC-A196-4401-8559-6EC9A4B8CBDA' /* tel2 */
                  ) kTel2 
     OUTER APPLY (SELECT TOP 1 KontaktInformation
                    FROM dbo.Kontakt_T 
                   WHERE dbo.Kontakt_T.PERS_ID = person.PERS_ID AND KontaktTyp_KAWE_ID = '3BF3490A-7A22-4A5E-9139-5662FFB4EDDC' /* url */
                  ) kUrl
     OUTER APPLY (SELECT TOP 1 KontaktInformation
                    FROM dbo.Kontakt_T 
                   WHERE dbo.Kontakt_T.PERS_ID = person.PERS_ID AND KontaktTyp_KAWE_ID = 'BB63F69A-4D71-45BE-AF76-592166D1DFB3' /* fax */
                  ) kFax
     OUTER APPLY (SELECT TOP 1 KontaktInformation
                    FROM dbo.Kontakt_T 
                   WHERE dbo.Kontakt_T.PERS_ID = person.PERS_ID AND KontaktTyp_KAWE_ID = '02215D24-F821-4EE7-A1BA-9F2E299C8E54' /* email */
                  ) kEmail
     OUTER APPLY (SELECT TOP 1 KontaktInformation
                    FROM dbo.Kontakt_T 
                   WHERE dbo.Kontakt_T.PERS_ID = person.PERS_ID AND KontaktTyp_KAWE_ID = 'E6F35188-1925-4F4B-928B-663AE20B5D0C' /* mobile */
                  ) kMobile
Aaron Bertrand 2014-01-09 15:17:41
I think changing the view to a PIVOT would be preferable. The current approach of a series of OUTER APPLY (TOP 1) is far from optimal. This isn't fantastic either, but it should lead to a single scan and eliminate almost all of the nested loops.

SELECT person.PERS_ID, /* person.other columns */
  k.GeschaeftTel, k.PrivateTel, k.Url, k.Fax, k.Email, k.Mobile,
  Anrede = kata.Code
FROM dbo.Person_T AS person
LEFT OUTER JOIN dbo.S_KatalogWert kata 
  ON kata.KAWE_Id = person.Anrede_KAWE_ID 
LEFT OUTER JOIN
(
    SELECT 
      PERS_ID,
      GeschaeftTel = [C70623F2-DEE0-4254-B2D6-84E52D8F0E47],
      PrivateTel   = [99CDAEEC-A196-4401-8559-6EC9A4B8CBDA],
      Url =          [3BF3490A-7A22-4A5E-9139-5662FFB4EDDC],
      Fax =          [BB63F69A-4D71-45BE-AF76-592166D1DFB3],
      Email =        [02215D24-F821-4EE7-A1BA-9F2E299C8E54],
      Mobile =       [E6F35188-1925-4F4B-928B-663AE20B5D0C]
    FROM dbo.Kontakt_T AS k
    PIVOT (MAX(KontaktInformation) FOR KontaktTyp_KAWE_ID IN 
      ([C70623F2-DEE0-4254-B2D6-84E52D8F0E47],
       [99CDAEEC-A196-4401-8559-6EC9A4B8CBDA],
       [3BF3490A-7A22-4A5E-9139-5662FFB4EDDC],
       [BB63F69A-4D71-45BE-AF76-592166D1DFB3],
       [02215D24-F821-4EE7-A1BA-9F2E299C8E54],
       [E6F35188-1925-4F4B-928B-663AE20B5D0C])
    ) AS p
) AS k
ON person.PERS_ID = k.PERS_ID;

The Kontakt_T I reference here is expected to be a table (particularly because of the _T suffix). If this is a view (with a UDF to grab each specific kontakt value), you have more unraveling to do. View upon view upon view upon UDF upon potentially more views – this looks like a nightmare. If you want a better answer you should provide the table structure where this data actually resides, instead of presenting us with this rat's nest of views. πŸ™‚

Patryk 2014-01-09 15:44:08
Great, thanks a lot! I'll try it out tomorrow πŸ™‚

No, Kontak_T is a table, not a view. I've just sent you already changed definition of V_Person_Ext, because I started to 'optimize' it and my first thought was udf too – sorry for confusion. Getting rid of udf brought around 60/40 performance boost but I expect more from PIVOT idea.

EDIT (next morning…)

So, it works and results in only one table scan. I had to wrap the pivot result and group it by PERS_ID, but it was not a big issue. Finally the plan looks much, much better! Once again many thanks!

One more edit:

I think I found another way to get the same result even faster than with PIVOT:

SELECT PERS_ID, 
  MAX(CASE KontaktTyp_KAWE_ID WHEN 'C70623F2-DEE0-4254-B2D6-84E52D8F0E47' THEN KontaktInformation END) as GTel,
  MAX(CASE KontaktTyp_KAWE_ID WHEN '99CDAEEC-A196-4401-8559-6EC9A4B8CBDA' THEN KontaktInformation END) as PTel,
  MAX(CASE KontaktTyp_KAWE_ID WHEN '3BF3490A-7A22-4A5E-9139-5662FFB4EDDC' THEN KontaktInformation END) as Url,
  MAX(CASE KontaktTyp_KAWE_ID WHEN 'BB63F69A-4D71-45BE-AF76-592166D1DFB3' THEN KontaktInformation END) as Fax,
  MAX(CASE KontaktTyp_KAWE_ID WHEN '02215D24-F821-4EE7-A1BA-9F2E299C8E54' THEN KontaktInformation END) as Email,
  MAX(CASE KontaktTyp_KAWE_ID WHEN 'E6F35188-1925-4F4B-928B-663AE20B5D0C' THEN KontaktInformation END) as Mobile
FROM Kontakt_T
 WHERE KontaktTyp_KAWE_ID in (
 'C70623F2-DEE0-4254-B2D6-84E52D8F0E47',
 '99CDAEEC-A196-4401-8559-6EC9A4B8CBDA',
 '3BF3490A-7A22-4A5E-9139-5662FFB4EDDC',
 'BB63F69A-4D71-45BE-AF76-592166D1DFB3',
 '02215D24-F821-4EE7-A1BA-9F2E299C8E54',
 'E6F35188-1925-4F4B-928B-663AE20B5D0C'
 )
GROUP by PERS_ID

Any comments would be appreciated

DaveBally 2014-01-09 14:41:25
I would suspect that the problem is the use of the udf get_KatalogWertId as a predicate.
udfs perform very poorly.

Can you rework the logic in it into the main query ?

Edit :

You also appear to have fallen foul of a common design problem. That being the EAV model , http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model, these also perform poorly ( as you have only just started to find out) is it to late to redesign to a 'proper' relational model ?

Patryk 2014-01-09 14:53:33
Yes I can and I'll try to get rid of this udf. Thanks!
Aaron Bertrand 2014-01-09 15:01:42
Weird, I don't even see this UDF referenced. Also, the contact information seems to be normalized the way I would do it (rather than keep adding columns to a flat table as more contact types are added), it just doesn't squarely meet the needs of this particular query.
DaveBally 2014-01-09 15:16:57
Aaron, this is what i mean, you not seeing that ?

http://postimg.org/image/9znqgbe89/

Sorry for the URL cant do another answer for a jpg

Agree, potentially πŸ™‚ , about the EAV model. If you need to do more than lookup on entity and attribute you start running into problems. Show me all contacts who have green eyes , will cause you grief etc…

Aaron Bertrand 2014-01-09 15:20:58
Thanks, I do now, because I didn't have Seek Predicates as a column. I didn't realize Kontakt_T was a view.

Filtered indexes can help quite a bit with specific patterns in EAV. I'm sure I've thrown this URL at you before, but I think EAV can be quite effective without suffering too many problems:

http://sqlblog.com/blogs/aaron_bertrand/archive/2009/11/19/what-is-so-bad-about-eav-anyway.aspx