BookmarkSubscribeRSS Feed
babu-in
Calcite | Level 5

I have several large datasets(30GB), and try to merge only few variables from second table to first table based on key variable. But since they are too big, it will take a long time. which method is most efficient to do this.

4 REPLIES 4
Kurt_Bremser
Super User

Are both tables of this size, or do you have one large table and a considerably smaller one used for a lookup?

Do you have a many-to-one or a many-to-many relationship?

 

In any case, you need to strive to "travel light", meaning you only keep the variables you need when sorting in preparation for a merge.

babu-in
Calcite | Level 5

yes, tables are same size and keys are unique in the both tables.

Kurt_Bremser
Super User

Then take a closer look at your data. See if the datasets are compressed, and see if you need all columns in your output or if you can drop some.

If in doubt, run proc contents on your datasets and post the output.

 

Also, please post the log of your merge code, using the {i} button (available when posting Rich Text).

Patrick
Opal | Level 21

@babu-in 

On this level the best answer I can give is "it depends".

Assuming both data sets are SAS tables what normally takes the most time is sorting and writing data to disk. That's either a Proc Sort with a data step merge or then a SQL Join. But also the SQL join will sort and write data to disk as intermediary temporary files in UTILLOC.

 

"and try to merge only few variables from second table"

IF you can fit these few variables into memory then most likely a data step hash lookup would perform best as this doesn't require any sort operations and reduces write operations to disk.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 3674 views
  • 0 likes
  • 3 in conversation