BookmarkSubscribeRSS Feed
avatar
Fluorite | Level 6

I have 8 datasets to combine to get the reports for a project I am working on

what would be the most efficent way to combine these tables?

is it good to add all 8 tables together in one step or add 2 tables with needed columns first and then add one by one???..

Query runs for 30 minutes with 8 tables in one step

thanks

7 REPLIES 7
Linlin
Lapis Lazuli | Level 10

one step is better.

avatar
Fluorite | Level 6

Thank you

Doc_Duke
Rhodochrosite | Level 12

In general, I agree with Linlin.  There are some additional things that you can do to potentially make the process faster.  I assume you are using EGuide's query builder task.  If you are doing this as a one-off, then 30 minutes is tolerable; if it is part of a production activity, then it may not be acceptable.

  1. make sure the data are in the order of the join variables first.  That way PROC SQL won't have to re-sort them to do the join.
  2. if you have done other data transformations in previous tasks, make sure you have saved the data set (if it is a VIEW, then SAS must redo the transformations as part of the query builder task.
  3. keep only the variables that you need.
  4. There are other tips here, but you can't necessarily get to them in Query Builder.  http://support.sas.com/documentation/cdl/en/sqlproc/62086/HTML/default/viewer.htm#a001360977.htm

Doc Muhlbaier

Duke

avatar
Fluorite | Level 6

I believe tables are from production system. I will try first takeing out order by clause

Thanks

avatar
Fluorite | Level 6

I just wanted to update this with my experience. It seems query ran faster when tables are added one by one than combining all 8 tables in one step.

Actually ran out of space on work library.

ghastly_kitten
Fluorite | Level 6

Well, from my point of view it's better to hardcode everything.

This will allow you to use certain options that will speed up your merge.

As Doc@Duke mentioned - it's vital for datasets to be sorted by variables which will be the keys for merging in order to decrease calculations AND memory consumption.

Maybe you can just explain your case and we'll see what we can do.

About space on work library. There's an important case when SAS is installed on Windows systems, which has C: drive with several Gb for "System purposes" (with Windows installed on it) and large 😧 (or any other letter) drive as a main storage appliance. For example, we have IBM blade server which implies this disk partitioning structure: C: 50 Gb 😧 500 Gb.

Now, by default SAS assigns WORK library to the folder which is in your "Windows user" folder, which is on drive C:. SAS also use temprorary folders in order to perform sorting and merging (also on C:). So in this case, even if you have datasets in libraries, that placed on a large storage (D:), if you try to merge or sort it SAS will use space on C:.

In order to "heal" this one can change the configuration file a bit. If you have this problem, I can explain you how to fix configuration file.

avatar
Fluorite | Level 6

Thanks for sharing info about work library. I am learning more about the tables and it's contents. Gives me some idea, joining  these tables in PROC SQL and reducing the time it takes to run

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1431 views
  • 0 likes
  • 4 in conversation