Why does the first plan run much faster than the second?
The second one derives its value(s) from a table (I have checked this many times).
Both were run on test data. In production the slow one runs for up to an hour; the fast one takes 1-3 minutes tops.
Statement1 in 967-PillsList_Anonymized.queryanalysis (fast)
Statement4 in 968-tmp556a_anonymized.queryanalysis (slow)
It is hard to give a direct recommendation from anonymized plans and without knowing the full situation. However, I would guess that the reason is different cardinality estimates for the hard-coded constants versus the "derived" values.
For hard-coded values the optimizer can use the distribution histogram directly to estimate the number of rows. For derived values there are several strategies, depending on what "derived" means. If the values are calculated and placed in local variables, the optimizer does not know them before the query is executed (i.e. they are unknown at compile time) and has to guess. In that case you can try OPTION (RECOMPILE) to let the optimizer see the values at runtime, at the cost of some extra recompilation overhead on every execution.
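For example, a minimal sketch of that option (all table, column, and variable names here are hypothetical, not taken from your plans):

```sql
-- The value arrives in a local variable, so at compile time the
-- optimizer cannot see it and has to guess the row count.
DECLARE @MinDate date;
SELECT @MinDate = MAX(LoadDate) FROM dbo.LoadLog;  -- hypothetical source of the "derived" value

SELECT p.PillId, p.PillName
FROM dbo.Pills AS p                 -- hypothetical table
WHERE p.CreatedDate >= @MinDate
OPTION (RECOMPILE);                 -- recompiles with the actual value, enabling a histogram-based estimate
```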
If the values are calculated in a subquery as part of the whole query, then the optimizer compiles the query as a whole. In my view there is little point comparing such a query with one that uses constants, because the optimizer may consider totally different strategies and steps to optimize it. It may be more effective to materialize intermediate results into temporary tables, for example.
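A hedged sketch of what that could look like (names are again hypothetical):

```sql
-- Materialize the derived values first. The temp table gets its own
-- statistics, so the second query is optimized against real row counts.
SELECT d.PillId
INTO #DerivedIds
FROM dbo.SomeDerivation AS d        -- stands in for the real subquery
WHERE d.IsActive = 1;

SELECT p.PillId, p.PillName
FROM dbo.Pills AS p
JOIN #DerivedIds AS i ON i.PillId = p.PillId;
```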
Lastly, you said these plans come from the test environment. To be sure your tuning effort will pay off in production, make sure the databases and servers are identical and that the problem is fully reproducible (and is the same problem) in test as in production. Otherwise, I'd recommend grabbing the actual problem plans from production and analyzing them. A good place to start is finding where the actual number of rows differs from the estimated number.
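For example, one way to capture an actual plan when you can run the statement yourself (the procedure name is a placeholder):

```sql
SET STATISTICS XML ON;          -- returns the actual plan, with actual and estimated row counts per operator
EXEC dbo.ProblemProcedure;      -- placeholder for the slow statement
SET STATISTICS XML OFF;
```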
In the Plan Diagram we also see two Index Scans. (If you right-click the plan diagram background and select Costs By >> I/O, they really pop out at you.) Both of these scans are on Table1.Index2. It's possible that the code is doing a wildcard search with 'myval%'. If the pattern has a leading wildcard, or is held in a variable, the engine cannot turn it into an index range, so it has to scan the whole index to find the value. Depending on the version of SQL Server, this can cause the optimizer to choose really bad execution plans.
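To illustrate the difference (Col1 and Col2 are hypothetical columns standing in for the real ones):

```sql
-- Trailing wildcard: the predicate converts to a range, so an index seek is possible.
SELECT Col1 FROM dbo.Table1 WHERE Col2 LIKE 'myval%';

-- Leading wildcard: no usable range, so the whole index must be scanned.
SELECT Col1 FROM dbo.Table1 WHERE Col2 LIKE '%myval%';
```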
I also see lots of Hash Joins and Index Spools. This usually indicates a rather complicated set of table joins. You may be able to achieve much better overall performance by deriving intermediate temp tables rather than cobbling the whole thing together in one really big query.
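As a rough sketch of that decomposition (the join structure and names are invented for illustration):

```sql
-- Stage the most selective part of the join first; each temp table
-- gives the optimizer fresh statistics for the next step.
SELECT a.KeyCol, a.Col1, b.Col2
INTO #Stage1
FROM dbo.TableA AS a
JOIN dbo.TableB AS b ON b.KeyCol = a.KeyCol
WHERE a.FilterCol = N'myval';

SELECT s.Col1, s.Col2, c.Col3
FROM #Stage1 AS s
JOIN dbo.TableC AS c ON c.KeyCol = s.KeyCol;
```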
Of course, without the T-SQL code I'm really only guessing. Hope that helps.