Why doesnt SQL Server just add a predicate in this case?
OUL.OrgUnitId = OUL.SourceOrgUnitId
But the second one is way more expensive. I thought that a query plan of the first query could have been modified to simply include the predicate with the expression above on the one operator. But SQL Server is either costing it as too expensive, or not generating that plan as a candidate to consider.
I'm running SQL Server 2014, SP2. When I use the old cardinality estimator, I still see the issue.
Your query has the following relevant equivalences:
- OUL.SourceOrgUnitId = OUL.OrgUnitId
- OUI.OrgUnitId = GI.GroupId
Applying those to the written 'join predicate' GI.GroupId = OUL.OrgUnitId, the written predicate becomes:
- GI.GroupId = OUL.SourceOrgUnitId; then
- OUI.OrgUnitId = OUL.SourceOrgUnitId
The final form OUI.OrgUnitId = OUL.SourceOrgUnitId relates the tables correctly, but there is no indexable expression predicate to form the basis of an indexed nested loops join into OUL (no index with SourceOrgUnitId as the leading key).
Given multiple (explicit or implicit) equivalences in a query, the optimizer will not explore all possible combinations against all indexes in order to make an indexed nested loops join possible.
As far as I am aware, this is deliberate: it avoids exploding compilation time and search space. Some queries (esp. with self-joins, overlapping predicates, imperfect indexing) may not always find the ideal index plan for all possible query specifications.
In your particular example, we can workaround the issue in more natural ways than given in the question. One is to write a problematic equivalence earlier:
DECLARE @UserId INT = 71710; DECLARE @SectionGroupTypeID INT = 819; SELECT OUL.UserId, OUL.OrgUnitId, OUL.SourceOrgUnitId FROM GROUPS GI JOIN ORG_USERS OUI ON OUI.OrgUnitId = GI.GroupId -- **Derived table with moved predicate** JOIN (SELECT * FROM ORG_USERS AS OU WHERE OU.SourceOrgUnitId = OU.OrgUnitId) OUL ON GI.GroupId = OUL.OrgUnitId WHERE GI.GroupTypeID = @SectionGroupTypeId AND OUI.UserId = @UserId OPTION (MAXDOP 1);
This is a possible practical workaround rather than a general recommendation since it does not rely on documented behaviour. The exact shape of the query tree (and so which predicates appear 'before' or 'after' others at simplification time) depends on numerous factors and implementation details that could change.
CREATE TABLE #T1 (col1 integer PRIMARY KEY); CREATE TABLE #T2 (col1 integer PRIMARY KEY, col2 integer NOT NULL); INSERT #T1 (col1) VALUES (1); INSERT #T2 (col1, col2) SELECT SV.number, SV.number FROM master.dbo.spt_values AS SV WHERE SV.[type] = N'P'; SELECT * FROM #T1 AS T1 JOIN #T2 AS T2 ON T2.col1 = T1.col1 WHERE T2.col1 = T2.col2; WITH T2 AS ( SELECT * FROM #T2 AS T2 WHERE T2.col1 = T2.col2 ) SELECT * FROM #T1 AS T1 JOIN T2 ON T2.col1 = T1.col1; DROP TABLE #T1, #T2;