jbilenas
Obsidian | Level 7

I was reading a May 10th, 2023 SAS-L1 email digest about issues with running merges on very large data sets (or tables). One poster reported long run times with merges and hash joins and wanted to know how to speed up merges taking 6 hours or more and hash joins taking 8 hours.

 

In my experience over the years, data merges and/or SQL joins often take too long; sometimes sorting and merging 2 files takes 2 or more days. This is a common issue in processing epidemiology and consumer data. In one case, my fix for a 2-day run, based on user-defined formats, resulted in a 10-minute merge. A user-defined format built from the smaller file's keys lets you read the larger file without sorting it and keep only the matching records. The result is a much smaller file that can then be sorted far more quickly.
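A minimal sketch of the format-based approach described above. All names here are hypothetical and not taken from the original run: `work.small` is the smaller file, `work.big` is the large unsorted file, and `id` is a character key assumed to be unique in `work.small`. The idea is to load the small file's keys into a format with CNTLIN, use `PUT` to filter the large file in a single unsorted pass, and only then sort and merge the reduced file.

```sas
/* Hypothetical inputs: work.small (lookup file), work.big (large, unsorted), */
/* character key variable id, assumed unique in work.small.                   */

/* Sort the small file once; NODUPKEY guards against overlapping format ranges. */
proc sort data=work.small out=work.small_sorted nodupkey;
   by id;
run;

/* Build a CNTLIN control data set: every key maps to 'Y', everything else to 'N'. */
data work.ctrl;
   set work.small_sorted (keep=id rename=(id=start)) end=last;
   retain fmtname '$keyfmt' label 'Y';
   output;
   if last then do;
      hlo   = 'O';   /* OTHER range: keys not present in the small file */
      label = 'N';
      output;
   end;
run;

proc format cntlin=work.ctrl;
run;

/* Single pass over the large file, no sort needed: keep only matching keys. */
data work.big_subset;
   set work.big;
   where put(id, $keyfmt.) = 'Y';
run;

/* Sort and merge the much smaller subset. */
proc sort data=work.big_subset;
   by id;
run;

data work.merged;
   merge work.big_subset (in=inbig) work.small_sorted (in=insmall);
   by id;
   if inbig and insmall;
run;
```

The format is held in memory, so this works best when the smaller file's key list fits comfortably in memory; how much time it saves depends on how much of the large file is filtered out before the sort.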
 
I have published a LinkedIn post that shows some examples of this issue and includes a few references on PROC FORMAT:

 

Jonas V. Bilenas
1 ACCEPTED SOLUTION

Tom
Super User

If the reason for the joins is a simple code/decode lookup, then formats (or informats) should always be considered. And if the mapping is many-to-one (for example, age ranges), then formats should require less memory than a hash object, which would require exact matches.
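To illustrate the many-to-one case, here is a small sketch of a range format for age groups. The data set `work.people`, the variable `age`, and the cut points are assumptions chosen for illustration. Four range entries cover every possible age, whereas an exact-match lookup would need one entry per distinct age value.

```sas
/* Hypothetical age bands; a few ranges replace one lookup row per age value. */
proc format;
   value agegrp
      low -< 18  = 'Under 18'
      18  -< 35  = '18-34'
      35  -< 65  = '35-64'
      65  - high = '65+'
      other      = 'Unknown';   /* missing ages fall here */
run;

/* Decode in a single pass; no sort or join against a lookup table. */
data work.with_groups;
   set work.people;             /* hypothetical input with numeric variable age */
   length age_group $8;
   age_group = put(age, agegrp.);
run;
```

The same format can also be attached directly with a `FORMAT` statement in a procedure such as PROC FREQ, which groups by the formatted value without creating a new variable.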

