App design for ETL type work – one large set or multiple smaller result sets?
Have an application that needs to provide back a result set from a normalized database into a single result sheet that is highly normalized for processing changes in excel.
- Must be pulled on demand
- Users could pull several thousand results at a time.
- Users could select fewer columns for the data, however, most are pulling all columns back on this, so the amount of data going across the network could be pretty heavy at busy times.
- Right now clients are pulling in 200 accounts at time and it's pretty slow with the views and the amount of data coming back.
Actions I'm taking right now
Right now I'm refactoring some of these result sets that are currently views (some with 21 tables). I've been carefully evaluating the statistics io/time and the plans for optimization, breaking some of the initial tables into temp tables, and trying to improve the efficiency with all the nested loops taking place.
From my predecessor, I'm hearing that they found breaking things into smaller chunks took longer overall, but they were using the views, and I'm using some tuned procs… so it's not a 1 for 1 comparison.
With time limited, I'm hoping I could get some general design practice info for optimizing this for this initial performance patch I'm going to release. I can't redo the entire process right now, so I've worked on tuning the procs, and also am considering the amount of results to pull back per account.
It's been running at 200 a pull. I reduced down to 50 and saw an exponential decrease in I/O being processed in the execution plan on some of the nested loops. (ie top 200 had 31 MB in one nested loop, but the top 50 had 5).
Is there a general design practice for these tough ETL type jobs with applications for amount to return back?
One large query vs breaking up into much smaller top50 results?
I'm leaning towards the smaller results due to all the cardinality estimation issues with the massive tables being returned in multiple joins, but also don't want to tank performance for customers that may deal with remote work and latency. I'm just not sure if it's considered less overhead to transmit smaller chunks 10x or 2 large chunks.