Merge join conundrum
In any case the if you wanted to know the table representing the many in the 1-many relationship has 120 million records. A nested loop group by takes 2 minutes but I decided that since I couldn't get the merge join I isolated the larger table and aggregated it on its own and then joined the smaller table and this took 30 seconds.
The total estimated cost of this plan is 1,687 units. The nested loops plan with the GROUP BY (and no join hint) has an estimated cost of 1,358 units.
Adding the GROUP BY results in a plan with enough CPU cost to convince the optimizer that parallel execution is worthwhile. Parallel merge join has quite a high cost assigned to it because parallel merge join does not scale well with DOP, and may introduce a risk of intra-query parallel deadlocks due to preserved ordering.
Because the parallel nested loops plan costs lower than the merge join alternative, it is chosen by the optimizer.
There is a possibility of a slightly more efficient merge join plan if you use OPTION (MERGE JOIN) rather than the INNER MERGE JOIN hint. Join hints force the order of joins to match the written order just as if OPTION (FORCE ORDER) had been specified. Allowing the optimizer to choose the input order to the merge join may change the plan. I would still expect the Nested Loops plan to be costed lower though.
Optimizer estimates are just estimates, so it can be important to test its assumptions. If the NLJ plan runs faster, use that. Be aware that parallel NLJ plans may have very different performance characteristics with a cold cache. NLJ generally uses a less efficient read-ahead (prefetch) mechanism than hash or merge join.
End of simplification, time: 0.007 net: 0.007 total: 0.007 net: 0.007
end exploration, tasks: 126 no total cost time: 0.011 net: 0.011 total: 0.018 net: 0.018
end exploration, tasks: 565 no total cost time: 0.007 net: 0.007 total: 0.025 net: 0.025
end search(1), cost: 1856.91 tasks: 586 time: 0 net: 0 total: 0.025 net: 0.025
end exploration, tasks: 587 Cost = 1856.91 time: 0 net: 0 total: 0.025 net: 0.025
end exploration, tasks: 897 Cost = 1856.91 time: 0.008 net: 0.008 total: 0.033 net: 0.033
end search(1), cost: 1357.98 tasks: 954 time: 0.002 net: 0.002 total: 0.035 net: 0.035
End of post optimization rewrite, time: 0 net: 0 total: 0.035 net: 0.035
End of query plan compilation, time: 0.001 net: 0.001 total: 0.036 net: 0.036
here is the demo code:
CREATE TABLE tbl1
(
dateid INT NOT NULL ,
itype INT NOT NULL ,
someid INT NOT NULL ,
startdateid INT NOT NULL ,
enddateid INT NOT NULL ,
anotherid INT NOT NULL ,
yetanotherid INT NOT NULL ,
lastid INT NOT NULL ,
avalue DECIMAL(19, 17)
)
CREATE UNIQUE CLUSTERED INDEX i1 ON tbl1(dateid,itype,someid,startdateid,enddateid,anotherid,
yetanotherid,lastid)
CREATE TABLE tbl2
(
dateid INT NOT NULL ,
itype INT NOT NULL ,
someid INT NOT NULL ,
startdateid INT NOT NULL ,
enddateid INT NOT NULL ,
anotherid INT NOT NULL ,
yetanotherid INT NOT NULL ,
lastid INT NOT NULL ,
someval DECIMAL(19, 17)
)
CREATE NONCLUSTERED INDEX id2 ON dbo.tbl2(dateid ,
itype ,
someid ,
startdateid ,
enddateid ,
anotherid ,
yetanotherid ,
lastid ) INCLUDE (someval)
SELECT a.dateid ,
a.itype ,
a.someid ,
a.startdateid ,
a.enddateid ,
a.anotherid ,
a.yetanotherid ,
a.lastid ,
SUM(avalue * someval) result
FROM dbo.tbl2 a
JOIN dbo.tbl1 b ON a.dateid = b.dateid
AND a.itype = b.itype
AND a.someid = b.someid
AND a.startdateid = b.startdateid
AND a.enddateid = b.enddateid
AND a.anotherid = b.anotherid
AND a.lastid = b.lastid
WHERE a.dateid = 20130731
GROUP BY a.dateid ,
a.itype ,
a.someid ,
a.startdateid ,
a.enddateid ,
a.anotherid ,
a.yetanotherid ,
a.lastid