Hash spills — can they be avoided?
1-) How can I justify the memory that needs to be reserved per hash join — is it divided equally between threads?
2-) Can I avoid hash spills in this query, if I do is it worth the gain?
3-) Can I control the hash spills by forcing the optimizer to grant more memory to the hash join, with a trace flag or in any other way?
2. If the server can satisfy an increased memory grant (up from the current ~6GB), then yes, the spill might be avoided. The current spill is only level 1, compared with level 2 previously, so the impact of completely avoiding the spill may not be so marked. The only way to know for sure is to test it.
3. There is no trace flag I am aware of that would increase memory grant here. The main tools at our disposal are to increase the estimated number of rows and/or the estimated average size of each row.
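As an illustration of the second lever, one common trick is to widen a column's declared type in the query itself, since the optimizer bases the memory grant partly on the estimated average row size. The sketch below is hypothetical — the table and column names are placeholders, and whether this helps depends on your actual plan:

```sql
-- Hypothetical sketch: inflating the estimated average row size so the
-- hash join receives a larger memory grant. Names are placeholders.
SELECT
    BT.KeyCol,
    -- CONVERT to a wider type than the column needs; the optimizer
    -- estimates row size from the declared type, increasing the grant.
    CONVERT(nvarchar(4000), BT.Payload) AS Payload
FROM dbo.BigTable AS BT
JOIN dbo.ProbeTable AS PT
    ON PT.KeyCol = BT.KeyCol;
```

The cost is that the grant is inflated for the whole query, so use this sparingly and verify the effect with the actual execution plan's memory grant and hash warning information.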
The query in question is far outside the optimizer's model, as you can tell by the difficulties you continue to encounter with parallelism and memory grant. Without access to at least a statistics-only copy of the tables concerned, and more likely test-system access, it is quite difficult to make specific suggestions.
My own experience with systems that needed queries similar to yours is that such designs always required unsupported SQL syntax tricks and CLR use to produce reasonable performance, as well as demanding an extremely fast (and expensive) I/O subsystem.
Ultimately, this type of system ends up being hard work to write, debug, and maintain. If the company in question is prepared to make that compromise, they will likely have to accept that a great deal of money will have to be spent on consultants with very specific expertise.