how to improve performance of a Sql query for Median Calcuation?
Thanks for looking into the below SQL :
Let me give the exact scenario.
CREATE TABLE [LeadMedian](
[LeadStatusId] [int] NULL,
[LeadStatus] nvarchar NULL,
[leadid] nvarchar NOT NULL,
[AccountID] nchar NULL,
[leaddate] [datetime] NULL,
[LeadGeoID] [int] NULL,
[LeadProgramID] [int] NULL,
[CurrentLeadOwnerType] varchar NULL,
[CurrentLeadSalesTeam] nvarchar NULL,
[DaysInStatuS] [int] NOT NULL)
INSERT [LeadMedian] VALUES (1, N'Ready for Qualification', N'00Q6000000bz9ncEAA', N'0016000000I7aA4AAJ', CAST(0x00009FD600000000 AS DateTime), 325, 11141, N'Queue', N'00G60000000vZAxEAM', 559)
INSERT [LeadMedian] VALUES (1, N'Ready for Qualification', N'00Q6000000c0XfvEAE', N'0016000000Fyz3TAAR', CAST(0x00009FDB00000000 AS DateTime), 325, 11383, N'Queue', N'00G60000000vZAxEAM', 554)
INSERT [LeadMedian] VALUES (1, N'Ready for Qualification', N'00Q6000000cbbgLEAQ', N'0016000000FzikkAAB', CAST(0x00009FE900000000 AS DateTime), 325, 12971, N'Queue', N'00G60000000vZAxEAM', 540)
INSERT [LeadMedian] VALUES (11, N'Inactive', N'00Q6000000WQMVaEAP', N'0016000000FzDsMAAV', CAST(0x00009EDF00000000 AS DateTime), 325, 10413, N'Queue', N'00G60000000vZAxEAM', 582)
INSERT [LeadMedian] VALUES (11, N'Inactive', N'00Q6000000WQMVaEAP', N'0016000000FzDsMAAV', CAST(0x00009EDF00000000 AS DateTime), 325, 10413, N'Queue', N'00G60000000vZAxEAM', 224)
INSERT [LeadMedian] VALUES (11, N'Inactive', N'00Q6000000ZBKYvEAP', N'0016000000FwbyfAAB', CAST(0x00009F4E00000000 AS DateTime), 325, 11895, N'Queue', N'00G60000000vZAxEAM', 471)
INSERT [LeadMedian] VALUES (11, N'Inactive', N'00Q6000000ZBKYvEAP', N'0016000000FwbyfAAB', CAST(0x00009F4E00000000 AS DateTime), 325, 11895, N'Queue', N'00G60000000vZAxEAM', 224)
INSERT [LeadMedian] VALUES (11, N'Inactive', N'00Q6000000ZBKYzEAP', N'0016000000NcLG5AAN', CAST(0x00009F4E00000000 AS DateTime), 325, 11895, N'Queue', N'00G60000000vZAxEAM', 471)
INSERT [LeadMedian] VALUES (11, N'Inactive', N'00Q6000000ZBKYzEAP', N'0016000000NcLG5AAN', CAST(0x00009F4E00000000 AS DateTime), 325, 11895, N'Queue', N'00G60000000vZAxEAM', 224)
select * from [leadmedian]
I would calculate the Median based on the above scenario and those values should show in the report like below.
So i would require when i pass any lead id's example '00Q6000000c0XfvEAE','00Q6000000bz9ncEAA','00Q6000000cbbgLEAQ' then the Median value should be 554 and 00Q6000000WQMVaEAP,00Q6000000ZBKYvEAP,00Q6000000ZBKYzEAP for this median value =695.
or if i pass currentleadteam='00G60000000vZAxEAM' and status='Ready for Qualification' then also median should be 554
so if i pass any lead id it should be display the respective Median value.
Thanks in advance and really appreciate your valuable inputs.
I tried below query and working on median calculation but taking long time to run ETL job
WITH CTE AS (
SELECT L.CurrentLeadOwnerType,L.CurrentLeadSalesTeam, l.LEADID,l.DaysInStatus,L.LeadStatus,l.AccountID,l.LeadProgramID,
l.LeadGeoID,l.LeadDate, l.LeadStatusId AS StatusID,
ROW_NUMBER() OVER (PARTITION BY l.leadStatusID,L.CurrentLeadOwnerType,L.CurrentLeadSalesTeam ORDER BY L.Daysinstatus) AS RNK
FROM dbo.LeadMedian L
–WHERE l.LeadID in ('00Q6000000bz9ncEAA','00Q6000000cbbgLEAQ',
— '00Q6000000c0XfvEAE','00Q6000000ZBKYzEAP','00Q6000000WQMVaEAP','00Q6000000ZBKYvEAP')
— AND lv.DaysInStatus<>0
WHERE l.DaysInStatus<>0
), CTEMedian AS (
SELECT Statusid,CurrentLeadOwnerType,CurrentLeadSalesTeam ,CASE
WHEN MAXRNK % 2 = 0 THEN (SELECT SUM(DAYSINSTATUS) FROM CTE WHERE RNK IN (MAXRNK/2,MAXRNK/2 + 1) AND
Statusid =X.Statusid and
CurrentLeadOwnerType=x.CurrentLeadOwnerType and
CurrentLeadSalesTeam=x.CurrentLeadSalesTeam
)
ELSE (SELECT (DAYSINSTATUS) FROM CTE WHERE RNK =(MAXRNK + 1)/2 AND Statusid =X.Statusid and CurrentLeadOwnerType=x.CurrentLeadOwnerType and CurrentLeadSalesTeam=x.CurrentLeadSalesTeam )
END AS [CURRENT MEDIAN]
FROM (
SELECT CurrentLeadOwnerType,CurrentLeadSalesTeam,Statusid,MAX(RNK) AS MAXRNK FROM CTE
GROUP BY Statusid ,CurrentLeadOwnerType,CurrentLeadSalesTeam
)X
), CTEGroup AS (
SELECT b.[CURRENT MEDIAN] as Median ,A.CurrentLeadOwnerType,a.CurrentLeadSalesTeam ,A.LeadStatus
,A.AccountID,A.LeadGeoID,A.LeadDate,A.LeadProgramID,A.StatusID ,A.LeadID
FROM CTE a INNER JOIN CTEMedian b ON a.Statusid=b.Statusid AND A.CurrentLeadOwnerType = B.CurrentLeadOwnerType AND A.CurrentLeadSalesTeam = B.CurrentLeadSalesTeam
), CTEfinal AS (
SELECT * FROM CTEGroup
)
select CurrentLeadSalesTeam,currentleadownertype,
leadstatus,Median
from ctefinal –WHERE CurrentLeadSalesTeam='00G60000000vZAxEAM'
Currently i have in 18567344 leadid's in my database , for calculating these many leadid's is taking long time…is there any performance tuning on the above SQL.
Thanks in advance.
Regards,
Suneel
Something like this
IF NOT EXISTS ( SELECT 1 FROM sys.indexes WHERE object_id = OBJECT_ID('[dbo].[LeadMedian]') AND [name] = 'CurrentLeadOwnerType_CurrentLeadSalesTeam_LeadStatusId' ) CREATE NONCLUSTERED INDEX CurrentLeadOwnerType_CurrentLeadSalesTeam_LeadStatusId ON [dbo].[LeadMedian] (CurrentLeadOwnerType, CurrentLeadSalesTeam, LeadStatusId) INCLUDE (DaysInStatuS) WITH (DATA_COMPRESSION = ROW) SELECT a.CurrentLeadOwnerType , a.CurrentLeadSalesTeam , a.LeadStatusId , CAST(MEDIAN_DaysInStatuS AS INT) AS MEDIAN_DaysInStatuS FROM ( SELECT a.CurrentLeadOwnerType , a.CurrentLeadSalesTeam , a.LeadStatusId , ( SUM(CASE WHEN DaysInStatuS IS NOT NULL THEN 1 ELSE 0 END) - 1 ) / 2 AS DaysInStatuS_offset , 2 - SUM(CASE WHEN DaysInStatuS IS NOT NULL THEN 1 ELSE 0 END) % 2 AS DaysInStatuS_fetch FROM [dbo].[LeadMedian] a GROUP BY a.CurrentLeadOwnerType , a.CurrentLeadSalesTeam , a.LeadStatusId ) a CROSS APPLY ( SELECT AVG(DaysInStatuS) AS MEDIAN_DaysInStatuS FROM ( SELECT p.DaysInStatuS FROM [dbo].[LeadMedian] p WHERE DaysInStatuS_offset IS NOT NULL AND DaysInStatuS_fetch IS NOT NULL AND p.DaysInStatuS IS NOT NULL AND p.CurrentLeadOwnerType = a.CurrentLeadOwnerType AND p.CurrentLeadSalesTeam = a.CurrentLeadSalesTeam AND p.LeadStatusId = a.LeadStatusId ORDER BY p.DaysInStatuS OFFSET ISNULL(a.DaysInStatuS_offset, 0) ROWS FETCH NEXT ISNULL(a.DaysInStatuS_fetch, 1) ROWS ONLY ) b ) b
thanks your inputs.
i executed the attached query and working in SQL server 2012 but not in 2008 R2.
since our production servers are using 2008 R2 .when i apply the above same query and there is an error
Msg 102, Level 15, State 1, Line 36
Incorrect syntax near 'OFFSET'.
Msg 102, Level 15, State 1, Line 36
Incorrect syntax near 'NEXT'
could you please provide me the solution in 2008 R2