Where should I start?

Jared Karney 2015-10-08 17:25:01

I have remedied the tempdb spill, but then it ran slower. I have tried limiting reults by using temp tables; slower. I have removed the order by; slower… Not sure what else I can do. Original execution time for a month is 27 seconds. EXEC dbo.GetMovedLoads '2015-09-01', '2015-10-09'

Procedure definition:

ALTER PROCEDURE [dbo].[GetMovedLoads] (
    @StartDate datetime = NULL,
    @EndDate datetime = NULL,
    @CompanyID INT = 1
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    DECLARE @EntityType_LoadCustomer AS INTEGER;
    SET @EntityType_LoadCustomer=12;
    SELECT @StartDate = IsNull(Convert(char(8), @StartDate, 112), '1/1/1900');
    IF @EndDate IS NULL
            SELECT  @EndDate = MAX([L].[LoadDateStripped])
            FROM    [dbo].[Load] AS [L];
    LO.ID as 'Load #'
    ,TDET.Code as 'Mode'
    ,TDET5.Code as 'Type'
    , CASE  WHEN (LO.Mode = 1)
                    THEN TDET1.Code
            WHEN (LO.Mode = 2)
                    THEN TDET1.Code
            WHEN (LO.Mode = 3)
                    THEN TDET2.Code
            END AS 'ProgressType'
    ,LO.ProgressType as 'LoadProgressType'
    ,LO.LoadDate as 'LoadDate'
    ,MONTH(LO.LoadDate) AS 'LoadDateMonth'
    ,CAST(SUBSTRING(convert(varchar, YEAR(LO.LoadDate)), 3, 2) AS INTEGER) AS 'LoadDateYear'
    ,LCUS.Name as 'Customer'
    ,EMP1.Code as 'Cust Rep'
    ,EMP1.[Group] as 'CustGroup'
    ,TDET6.Code AS 'CustDivision'
    ,TDET4.Code as 'CustGroupCode'
    ,commCustRep.RepCommission as 'Cust Rep $'
    ,FAC1.Name as 'Shipper'
    ,CIT1.Name + ', ' + CIT1.StateCode  as 'Origin'
    ,FAC2.Name as 'Consignee'
    ,CIT2.Name + ', ' + CIT2.StateCode as 'Destination'
    ,LO.Miles as 'Miles'
    ,LCAR.Name as 'Carrier'
    ,EMP2.Code as 'Carrier Rep'
    ,EMP2.[Group] as 'CarrGroup'
    ,TDET7.Code AS 'CarrDivision'
    ,TDET3.Code as 'CarrGroupCode'
    ,commCarrRep.RepCommission as 'Carr Rep $'
    ,LO.TotalRate as 'Rate'
    ,LO.TotalCost as 'Cost'
    ,(LO.[TotalRate] - LO.[TotalCost]) as 'GrossMargin'
    ,(LO.[TotalRate] - LO.[TotalCost] - LO.[OverHead]) as 'NetMargin'
    ,COALESCE(pcus.[Name], LCUS.Name, '') AS 'ParentCustomer'
    , CASE WHEN U.ID > 0 THEN U.Code ELSE '' END AS 'BookByRep'
    , CUSBC.Code AS 'CustRepLoc'
    , CARBC.Code AS 'CarRepLoc'
    ,LO.ID AS 'LoadID'
    ,lcus.customerid AS 'CustomerID'
    ,CIT1.ID AS 'OriginCityID'
    ,CIT2.ID AS 'DestCityID'
    , LCUS.OpsGroup AS 'OpsGroup'
    ,LO.EquipmentType AS 'EQType'     
    -- Begin Equipment Formatting
    IsNull(LO.EquipmentType, '')
    Case IsNull(LO.EquipmentLength, 0)
        When 0 Then ''
        Else ' ' + Cast(LO.EquipmentLength As varchar(6))
    Case IsNull(LO.EquipmentWidth, 0)
        When 0 Then ''
        Else ' ' + Cast(LO.EquipmentWidth As varchar(6))
    Case IsNull(LO.EquipmentHeight, 0)
        When 0 Then ''
        Else ' ' + Cast(LO.EquipmentHeight As varchar(6))
    As 'EQ'
    -- End Equipment Formatting
    ,LO.RoutingRankType AS 'RoutingRankType'
    ,CASE WHEN (SELECT COUNT(*) FROM LoadIncident LI WHERE LoadID = LO.ID AND ResolvedDate = '1753-01-01 00:00:00' AND ResolvedBy = 0) > 0
            THEN 1
        ELSE 0
        END AS 'OpenIncidents'    
    ,LI.TotalIncidentsCount AS  'TotalIncidentsCount'
    ,ISNULL(LRD.TotalRate, 0) + ISNULL(LA.AdjustmentAmount,0) - ISNULL(LP.TotalPayments,0) AS 'Balance'
    ,RQ.CustomerLaneIdentifier AS 'CustomerLaneIdentifier'
    FROM dbo.Load LO 
    INNER JOIN dbo.LoadCustomer LCUS on LCUS.LoadID = LO.ID and LCUS.Main = 1
        AND LO.LoadDate >= @StartDate
        and LO.LoadDate < @EndDate + 1
        AND LO.CompanyID = @CompanyID
        AND LO.StateType = 1
    LEFT JOIN dbo.TypeDetails TDET on TDET.ID = LO.Mode and TDET.TypeCategoryID = 132
    LEFT JOIN dbo.TypeDetails TDET1 on TDET1.ID = LO.ProgressType and TDET1.TypeCategoryID = 85
    LEFT JOIN dbo.TypeDetails TDET2 on TDET2.ID = LO.ProgressType and TDET2.TypeCategoryID = 182
    LEFT JOIN dbo.TypeDetails TDET5 on TDET5.ID = LO.Type and TDET5.TypeCategoryID = 69
    CROSS APPLY (SELECT TOP 1 LS.LoadId, LS.FacilityId
                FROM dbo.LoadStop LS
                WHERE (LS.[TYPE] = 1 OR LS.[TYPE] = 5)
                AND  LS.LoadID = LO.ID
                ORDER BY LS.Sequence ASC) LS1   
    INNER JOIN dbo.Facility FAC1 on FAC1.ID = LS1.FacilityID
    INNER JOIN dbo.City CIT1 on CIT1.ID = LO.OriginCityID
    CROSS APPLY (SELECT TOP 1 LS.LoadId, LS.FacilityId
                FROM dbo.LoadStop LS
                WHERE (LS.[TYPE] = 2 OR LS.[TYPE] = 6)
                AND  LS.LoadID = LO.ID
                ORDER BY LS.Sequence DESC) LS2  
    INNER JOIN dbo.Facility FAC2 on FAC2.ID = LS2.FacilityID
    INNER JOIN dbo.City CIT2 on CIT2.ID = LO.DestinationCityID
    LEFT JOIN dbo.LoadCarrier LCAR on LCAR.LoadID = LO.ID and LCAR.IsBounced = 0 and LCAR.Main = 1
    LEFT JOIN dbo.LoadRep LREP1 on LREP1.LoadID = LO.ID and LREP1.EntityType = 12 and LREP1.EntityID = LCUS.ID and LREP1.main = 1 
            SELECT ie.employeeId AS ID, ie.CompanyBranchID, ie.Division, ie.[group], su.code
            FROM dbo.InternalEmployee ie
            INNER JOIN dbo.Person p ON ie.EmployeeID = p.PersonID
            LEFT OUTER JOIN dbo.SystemUser su ON ie.EmployeeID = su.UserID
                ) EMP1 on EMP1.ID = LREP1.EmployeeID
    LEFT JOIN (SELECT su.UserID AS ID, su.code
                FROM dbo.SystemUser su
                INNER JOIN dbo.Person p ON su.UserID = p.PersonID) U
        ON U.ID = LCAR.BookByUserID
    LEFT JOIN dbo.CompanyBranch CUSBC ON CUSBC.ID = EMP1.CompanyBranchID
    LEFT JOIN dbo.TypeDetails TDET6 ON TDET6.id = EMP1.Division AND TDET6.TypeCategoryid = 18
    LEFT JOIN dbo.LoadRep LREP2 on LREP2.LoadID = LO.ID and LREP2.EntityType = 13 and LREP2.EntityID = LCAR.ID and LREP2.main = 1 
            SELECT ie.employeeId AS ID, ie.CompanyBranchID, ie.Division, ie.[group], su.code
            FROM dbo.InternalEmployee ie
            INNER JOIN dbo.Person p ON ie.EmployeeID = p.PersonID
            LEFT OUTER JOIN dbo.SystemUser su ON ie.EmployeeID = su.UserID
            ) EMP2 on EMP2.ID = LREP2.EmployeeID
    LEFT JOIN CompanyBranch CARBC ON CARBC.ID = EMP2.CompanyBranchID
    LEFT JOIN dbo.TypeDetails TDET7 ON TDET7.id = EMP2.Division AND TDET7.TypeCategoryid = 18
    LEFT JOIN dbo.LoadCommission commCustRep on commCustRep.EmployeeID = LREP1.EmployeeID AND commCustRep.LoadID = LO.ID AND commCustRep.EntityType = 12 AND LCUS.ID = commCustRep.EntityID
    LEFT JOIN dbo.LoadCommission commCarrRep on commCarrRep.EmployeeID = LREP2.EmployeeID AND commCarrRep.LoadID = LO.ID AND commCarrRep.EntityType = 13 AND LCAR.ID = commCarrRep.EntityID
    LEFT JOIN dbo.TypeDetails TDET3 on (EMP2.[Group] = TDET3.ID and TDET3.TypeCategoryId = 20)
    LEFT JOIN dbo.TypeDetails TDET4 on (EMP1.[Group] = TDET4.ID and TDET4.TypeCategoryId = 20)
    LEFT JOIN dbo.Customer Cus ON cus.id = lcus.customerid AND ISNULL(cus.parentCustomerid, 0) <> 0
    LEFT JOIN dbo.Customer pCus ON pcus.id = cus.parentCustomerid   
    LEFT JOIN (SELECT   lrd.LoadId, ISNULL(SUM(lrd.[Amount]), 0) AS TotalRate
                    FROM    [dbo].[LoadRateDetail] lrd
                    WHERE   lrd.EntityType = 12
                    GROUP BY lrd.LoadId) LRD
                ON LRD.LoadId = LO.Id                           
    LEFT JOIN (SELECT   lp.LoadID, ISNULL(SUM(lp.[AppliedAmount]), 0) AS TotalPayments
                     FROM   [dbo].[LoadPayment] lp
                     WHERE  lp.EntityType = 12
                     GROUP BY lp.LoadId) LP
                ON  LP.LoadId = LO.Id
    LEFT JOIN (SELECT   la.LoadId, ISNULL(SUM(la.Amount), 0) AS AdjustmentAmount
                     FROM   [dbo].[LoadAdjustment] la
                     WHERE  la.EntityType = 12
                     GROUP BY la.LoadId) LA
                ON  LA.LoadId = LO.Id
    LEFT JOIN dbo.LoadRate LR 
                ON LR.LoadID = LO.Id  AND LR.OriginalQuoteRateLineItemID <> 0  AND LR.EntityType=12
    LEFT JOIN dbo.RateQuote RQ 
                ON RQ.ID = LR.OriginalQuoteRateLineItemID AND RQ.CustomerLaneIdentifier IS NOT NULL AND RQ.CustomerLaneIdentifier <> ''
                SELECT COUNT(1) AS TotalIncidentsCount
                FROM dbo.LoadIncident li
                WHERE li.LoadID = LO.ID
                ) LI
    ORDER BY LO.LoadDate DESC;

Added updated plan to show the changes I made. Will continue to work on the query for #Balance. Thanks!

Aaron Bertrand 2015-10-08 19:11:40
The plan XML does not maintain the entire statement, is there any chance you could post the whole query somewhere? I count 26 tables involved in this query, and there isn't much hope for SQL Server to simplify that without substantial rewrites. It is often simpler to use intermediate storage (e.g. #temp tables) for some of the smaller inputs and removing the source tables from the larger query, especially if the data there is accessed via scans. But without the whole query it will be challenging to suggest anything specific.
Jared Karney 2015-10-09 14:07:01
This morning I started by taking the source table (Load) out of the main query and inserting the data into a temp table using the filters for that table. I then started looking at where my biggest pipes were and using that table to make a couple of smaller temp tables to use in the final query. This seems to have helped for larger date ranges, but is negligible with a small range. Overall, I think doing what you said, Aaron, was helpful. Thanks!
SQLkiwi 2015-10-09 14:34:31
You don't say how long the query currently runs for, and the attached file was imported from SSMS rather than generated by running the query directly from Plan Explorer, so duration and useful metrics like CPU and I/O are missing. That said:

  1. Almost 30 seconds is spent compiling this query. That is not unusual for a query with so many outer joins and applies, but it is still a concern.
  2. The execution plan is not optimized for the runtime value of @EntityType_LoadCustomer, leading to cardinality estimation errors in several areas. Embedding the value with dynamic SQL, using sp_execute_sql, or making it sniffable by creating an inner procedure, might be worthwhile. If the plan will never be reused, OPTION (RECOMPILE) might be an option for you (more so if the execution time is significantly larger). You might also consider an OPTIMIZE FOR hint if there is a suitable representative value, or UNKNOWN perhaps. In short (bit late maybe), all the usual solutions should be considered.
  3. The above may help fix the Hash Join spills, if not, that is another area to concentrate on.
  4. There are a number of seeks with residual predicates, which may or may not be expensive. Review your indexes to ensure residual predicates are eliminated as far as possible (you may need to consider filtered indexes in some cases, e.g. where multiple inequality predicates are present).
  5. The single parallel query plan has 24 concurrent branches, reserving 192 threads. Check your query does not wait on thread reservation. Really, this is excessive in any case.
  6. The final merge semi join is many-to-many. This is typically inefficient (statistics I/O worktable I/O would have told us exactly how expensive). You may need to express this condition differently, or provide a uniqueness guarantee somewhere to get a one-to-many merge.
  7. The lazy index spool can likely be removed with proper indexing and care over the types of expressions being compared.

The biggest issue though, is that the query is simply unwieldy. I doubt anyone looks forward to maintaining or tuning it. It consumes vast thread and memory resources, and still performs unacceptably. Many of the intermediate results are small (~165,000 rows is a popular number).

With correct use of temporary tables (perhaps indexed), it should be possible to break this monolith down into something much simpler, that performs better, and is far easier to maintain. I can't do this work for you, but really most of it is the application of common sense and the application of query tuning fundamentals.

Jared Karney 2015-10-09 18:19:44
Do you have any guidelines about adding a clustered index before or after data insertion on a temp table? Also, my goal now is to:
1. Get the data from the source table filtered to limit the results in a temp table
2. Use that limited set (with a clustered index on ID) to join and aggregate the other tables into temp tables so that no aggregations are done in the final query.
3. Optimize indexes for this
4. Removed that silly non-variable variable @EntityType_LoadCustomer and just use the static value

In the end, I will have a final query with fewer joins and no aggregations. Is that about right?

SQLkiwi 2015-10-10 05:45:46
I generally add the clustered index afterward because cardinality estimates for the sort will be better, but you should test it both ways. Yes you seem to be progressing along the right sort of lines. There is a skill in identifying which joins and aggregations should be materialized as an intermediate result set. One size definitely does not fit all.