01-02-2013 01:26 PM
I have 8 datasets to combine to get the reports for a project I am working on
what would be the most efficent way to combine these tables?
is it good to add all 8 tables together in one step or add 2 tables with needed columns first and then add one by one???..
Query runs for 30 minutes with 8 tables in one step
01-03-2013 09:49 AM
In general, I agree with Linlin. There are some additional things that you can do to potentially make the process faster. I assume you are using EGuide's query builder task. If you are doing this as a one-off, then 30 minutes is tolerable; if it is part of a production activity, then it may not be acceptable.
01-09-2013 11:44 AM
I just wanted to update this with my experience. It seems query ran faster when tables are added one by one than combining all 8 tables in one step.
Actually ran out of space on work library.
01-10-2013 02:20 AM
Well, from my point of view it's better to hardcode everything.
This will allow you to use certain options that will speed up your merge.
As Doc@Duke mentioned - it's vital for datasets to be sorted by variables which will be the keys for merging in order to decrease calculations AND memory consumption.
Maybe you can just explain your case and we'll see what we can do.
About space on work library. There's an important case when SAS is installed on Windows systems, which has C: drive with several Gb for "System purposes" (with Windows installed on it) and large D: (or any other letter) drive as a main storage appliance. For example, we have IBM blade server which implies this disk partitioning structure: C: 50 Gb D: 500 Gb.
Now, by default SAS assigns WORK library to the folder which is in your "Windows user" folder, which is on drive C:. SAS also use temprorary folders in order to perform sorting and merging (also on C. So in this case, even if you have datasets in libraries, that placed on a large storage (D, if you try to merge or sort it SAS will use space on C:.
In order to "heal" this one can change the configuration file a bit. If you have this problem, I can explain you how to fix configuration file.
01-10-2013 10:54 AM
Thanks for sharing info about work library. I am learning more about the tables and it's contents. Gives me some idea, joining these tables in PROC SQL and reducing the time it takes to run