How can I get a parallel plan without forcing QUERYTRACEON 8649

GokhanVarol 2013-03-18 20:54:07

This is a typical query in our environment: it scans large partitioned tables (3300+ partitions across 100 tables holding close to 1 TB of data; the largest table has 1.5 billion rows and the widest has over 500 columns, with lots of nullable strings, numerics, dates, etc.).
The tables are partitioned on a CntyCd char(5) column, which is also the leading column of the primary key.
In this query every table except the driver table at the top is partitioned. The driver table holds the row count for each county (in the Rows column), so it is scanned in descending row-count order to balance the workload across the threads.
All the tables underneath the cross apply (nested loop) use a TOP to suppress parallelism below the nested loop, so that parallelism is initiated only from the cross apply above.
This query outperforms our other queries by orders of magnitude, but I am not able to get a parallel query plan without forcing trace flag 8649.
Also, is there any way to stop the hash join from spilling to disk, perhaps by faking the row counts flowing into it with a TOP or similar?
Thank you

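A minimal sketch of the pattern described above, with illustrative table and column names (DriverRowCounts, BigPartitionedTable, and the rest are assumptions, not the poster's actual schema):

```sql
-- Hypothetical sketch: a driver table of per-county row counts feeds a
-- CROSS APPLY; a TOP inside the apply discourages parallelism below the
-- join, and undocumented trace flag 8649 forces a parallel plan overall.
SELECT d.CntyCd, ca.SomeCol
FROM dbo.DriverRowCounts AS d            -- one row per county, with a Rows column
CROSS APPLY
(
    SELECT TOP (9223372036854775807)     -- "infinite" TOP, shapes the plan only
        t.SomeCol
    FROM dbo.BigPartitionedTable AS t
    WHERE t.CntyCd = d.CntyCd            -- partition elimination per county
) AS ca
ORDER BY d.Rows DESC                     -- largest counties first to balance threads
OPTION (QUERYTRACEON 8649);              -- undocumented: remove the cost penalty on parallel plans
```

Note that QUERYTRACEON 8649 is undocumented and unsupported, which is the motivation for the question.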

Jonathan Kehayias 2013-03-18 21:44:13
Isn't this essentially the same problem you are asking about on your other post?

SQLkiwi 2013-03-20 07:23:10
It is usually possible to get the optimizer to produce a parallel plan using one arcane SQL syntax trick or another, or more usually a combination. Once that is achieved, however, you have to ask yourself how maintainable the final query is, how robust it will be in the face of changing data, and whether it will continue to work when you apply the next cumulative update, service pack, or upgrade to the next version. Is the resulting plan any more stable than the TF 8649 one? Is the query suitable for a plan guide based on the 8649 XML? These are all options, and tough local decisions to make.
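One of the options mentioned, pinning the TF 8649 plan to the hint-free query via a plan guide, might be sketched roughly as follows (the query text and plan XML are placeholders; the guide name is made up):

```sql
-- Capture the actual plan XML produced under TF 8649, then attach it to the
-- unhinted statement with USE PLAN inside a plan guide. The statement text
-- must match the application's query exactly, character for character.
EXEC sp_create_plan_guide
    @name            = N'PG_ParallelFlatten',            -- hypothetical name
    @stmt            = N'SELECT ... ;',                  -- exact query text, without hints
    @type            = N'SQL',
    @module_or_batch = NULL,
    @params          = NULL,
    @hints           = N'OPTION (USE PLAN N''<ShowPlanXML ...>...</ShowPlanXML>'')';
```

Whether the resulting plan is any more stable than one forced by the trace flag is exactly the trade-off discussed above.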
GokhanVarol 2013-03-20 14:34:04
I understand. Since the database is difficult to write queries against, I was thinking of flattening the data for users (that is what this query does) and keeping this convoluted logic in this one query and nowhere else, so that users get a simple flattened table to query. I can also use the result to build a DW cube for quicker querying.
Jonathan Kehayias 2013-03-20 07:34:55
I have to ask, based on the past conversations I've had with you about your schema design and the types of problems you have been posting on this site: have you considered that your underlying database design is the root cause of the problems you are experiencing, and that it would be better to change the design to suit the way you actually use the data, rather than the way the data is initially made available to you for loading? There are a lot of considerations that go into the data model, and it really seems to me that the model you have chosen isn't working for the way you are actually consuming the data.

As a general rule, I look at ways to refactor the table schema, or at a bare minimum the way the data is being queried, before I would ever look at using hints inside my code. Any time you use hints, you introduce limitations on future changes/improvements inside SQL Server that then have to be tested. Additionally, any time I see a hash or sort spill to tempdb, I look for redesign options that could prevent the expensive operation leading to the spill well before I try to work around it by increasing the memory grant associated with the operation. It is critical to keep in mind that the memory required for the grant is memory that cannot be used by the buffer pool for caching data pages. A large grant can therefore take memory away from the buffer pool, with the end result of higher disk I/O that slows query performance overall, or worse, further pressure on the data cache that forces lazy writes of tempdb pages out to disk, degrading performance even more.

I fully understand that changing database schemas is a drastic approach that may require significant rewrites of application code or data loading processes, but in the long term, fixing database design flaws pays the best dividends for performance.

GokhanVarol 2013-03-20 14:15:42
The current database design is basically 3300+ databases, one for each US county, squeezed into a single database. The company assembles the data (each line in the flat files has slightly over 5000 columns) and puts it in normalized form. Because the data arrives in chunks (text files anywhere from 1 GB to 300 GB that have to be made available in the database quickly), only staging tables and partition switches made that possible. All this resulted in 100+ large tables (the largest is a little over 1 TB with 700 million rows, and the largest row count is over 1.3 billion rows).
GokhanVarol 2013-03-20 14:15:55
There are 2 types of clustered keys: the Transaction side has one set of clustered keys and the Property side has another.
Joining only transaction tables together is mostly done through merge joins, and there is no problem there. The problem starts when joining tables with different clustered key orders: hash joins combined with 3300+ partitions demand an ideal memory grant of over 1.6 TB in some cases, but can only get a maximum of around 150 GB.
GokhanVarol 2013-03-20 14:16:09
I am trying to drive this cost down by splitting the large joins by county code (basically breaking the gigantic join into 3300 small ones). This speeds up the query and reduces the memory grant, but by default SQL Server does not produce a parallel plan, and I end up forcing one with trace flag 8649.
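The "break the gigantic join into 3300 small ones" approach might look something like the sketch below. All object and column names (CountyList, TransTable, PropTable, FlattenedResult, PropKey) are assumptions for illustration, not the actual schema:

```sql
-- Hedged sketch: loop over county codes and join within one partition at a
-- time, so each hash join sees only one county's rows and needs a far
-- smaller memory grant than the single national join.
DECLARE @CntyCd char(5);

DECLARE county_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT CntyCd FROM dbo.CountyList ORDER BY Rows DESC;   -- biggest counties first

OPEN county_cur;
FETCH NEXT FROM county_cur INTO @CntyCd;

WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT dbo.FlattenedResult
    SELECT t.CntyCd, t.TransKey, p.PropCol
    FROM dbo.TransTable AS t
    JOIN dbo.PropTable  AS p
      ON  p.CntyCd  = t.CntyCd
      AND p.PropKey = t.PropKey
    WHERE t.CntyCd = @CntyCd             -- partition elimination on both sides
      AND p.CntyCd = @CntyCd
    OPTION (QUERYTRACEON 8649);          -- each small batch still needs the flag to go parallel

    FETCH NEXT FROM county_cur INTO @CntyCd;
END

CLOSE county_cur;
DEALLOCATE county_cur;
```

This trades one huge memory grant for 3300 small ones, at the cost of per-batch overhead, which matches the behavior described: fast per-county, but not parallel by default.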
GokhanVarol 2013-03-20 14:16:19
A little history about the database: the data structure has been around for 18 years or so (the conversion from the mainframe was only completed at the end of last year). Our clients, banks, government, etc. demand the data in the same format they have been getting since they started using our service; for that reason and others the structure of the database is where it is right now. Nevertheless, this database does not match OLTP or
GokhanVarol 2013-03-20 14:16:25

DW requirements from what I have seen; it seems it does not fit SQL Server's model, and yes, it's sometimes painful to work with (when data is filtered by the partition key things are awesome, but querying data nationally (getting data from all partitions) is extremely demanding on memory and more).