Why spill to TempDB
Both – estimated data and estimated records are slightly greater than the actual. Ok, [kontokorrent_einzel] is a big table (6.5 mio records), so the data size is about 1 GB, but the server has 128 GB, so this should not be a problem.
BTW: the query runs ~ 15 min and I have no idea, how to make it faster. Originally it were all INNER JOINs, but (although I have actual statistics) it "believes" that it will return only ~200k when I use INNER JOINs. Without the forced HASH JOINs the query will need ~ 30 min (because it makes ~6 mio NESTED LOOPs to each of the both joins to t_termine)
The answer lies on the Plan Tree tab, which shows the number of rows processed by each thread at the hash join in question:
Memory is reserved for the hash join on the basis of the expected number and size of the rows. This reserved memory is acquired at the beginning of query execution. It cannot be increased once execution has started, no matter how much free memory the instance might have available at that time. This is by design.
For a row-mode parallel hash join, the memory grant is also evenly split between the hash table instances. With four threads, this means 25% each. Where the rows are skewed, with more rows ending up on one thread than another, the thread that exceeds its allowance will spill. Memory cannot be shared between threads.
In your plan, rows are allocated to hash join threads using hash partitioning of the vertragsig column (the join key). This information is shown on the exchange (Parallelism) operators below each input to the hash join.
When the next service pack for 2014 ships, it will likely include the plan improvements shipped for 2012 SP3, which shows more detailed spill information, including which threads spilled in a parallel execution.
You should not use join hints like LEFT HASH JOIN. This does not simply force a hash join, it also prevents the optimizer from reordering joins or swapping inputs, exactly as if you had added an OPTION (FORCE ORDER) hint.
If you want to force only hash joins for this query, use OPTION (HASH JOIN) instead.
This partly explains why your query is so slow: you are forcing SQL Server to access the tables in a particular order, and forcing it to build the hash table on the much larger inputs. This is wasteful (reserving 5.6GB of memory) and inefficient.
Remove the LEFT HASH JOIN hints and replace them with OPTION (HASH JOIN).