It works but not sure how does it work
What I cannot get my head around though is even though the first join has a top 1 in it it still manages to parallelize into 6 threads in equal chunks. At first I thought I had to find out the servers maxdop and put that many records into the temp table and do a join like (INNER JOIN VTUTest.dbo.VTUInput_24031 AS v ON v.[Row] % @maxdop = pf.[Row]) which also worked but has more lines of code and more variables.
Anyhow the code works so far (suggestions always more than welcome, I am trying things out here = I think I know what I am doing but I know in some time I will realize I had no clue 🙂 ) but how can a 1 row table scan can end up in a nicely threaded query?
DECLARE @This_Is_to_Fake_the_optimizer_to_get_a_parallel_query_plan_Its_Value_Is_0 INT = 0 SELECT v.[Row] ... FROM ( SELECT TOP ( 1 ) [Row] FROM VTUTest.dbo.ParallelFeed_24031 ORDER BY [Row] ASC ) pf INNER JOIN VTUTest.dbo.VTUInput_24031 AS v ON SIGN(v.[Row] ) -1= pf.[Row] OUTER APPLY ( SELECT * FROM aTrans.ACLRFUNCTION(Inputs) ) vtu WHERE @This_Is_to_Fake_the_optimizer_to_get_a_parallel_query_plan_Its_Value_Is_0 = 0 ) AS vtu OPTION ( RECOMPILE, OPTIMIZE FOR ( @This_Is_to_Fake_the_optimizer_to_get_a_parallel_query_plan_Its_Value_Is_0 = 1 ) )
The Distribute Streams exchange operator between the Top (1) and Nested Loops Join is running in Broadcast mode. The single row is sent (copied) to all parallel threads. Your plan is an example of a very special situation: the inner side of a nested loops join generally runs serially (per thread) but where there is a guarantee that one row arrives at the outer input, and there are no correlated parameters on the loops join, the query processor can broadcast the row and use parallelism on the inner side of the join. The clustered index scan is a proper parallel scan, hence the nice distribution of work.