First, get to know your data. If there are lots of long character variables, the physical dataset size can become overwhelming. Even if COMPRESS is used, sort utility files will be uncompressed and take a long time to write and read.
If your "child datasets" are in fact lookup tables, these can be handled much better by hash objects than by individual joins. I have often replaced multiple SORT/MERGE or SQL JOIN steps with a single data step that uses multiple hashes in a sequential read of the large dataset, resulting in a BIG performance gain.
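To illustrate the hash approach, here is a minimal sketch of one data step that looks up values from two small tables while reading the large dataset once. All dataset and variable names (WORK.BIG, WORK.PRODUCTS, WORK.REGIONS, product_id, region_id and so on) are made up for the example; substitute your own. It assumes both lookup tables fit in memory and that a non-match should simply leave the looked-up value missing, as in a left join.

```sas
/* Sketch only: all dataset and variable names here are hypothetical. */
data work.enriched;
  /* Define the lookup variables in the PDV without reading any rows */
  if 0 then set work.products(keep=product_id product_name)
                work.regions(keep=region_id region_name);

  if _n_ = 1 then do;
    /* Load each lookup table into memory once */
    declare hash prod(dataset: 'work.products');
    prod.defineKey('product_id');
    prod.defineData('product_name');
    prod.defineDone();

    declare hash reg(dataset: 'work.regions');
    reg.defineKey('region_id');
    reg.defineData('region_name');
    reg.defineDone();
  end;

  /* Single sequential pass over the large dataset, no sorting needed */
  set work.big;

  /* find() returns 0 on a match; otherwise clear the lookup value,
     because variables that come from SET are retained across iterations */
  if prod.find() ne 0 then call missing(product_name);
  if reg.find() ne 0 then call missing(region_name);
run;
```

Because the large dataset is read in its existing order, neither it nor the lookup tables need to be sorted first, which is usually where most of the gain comes from.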
Locate the longest-running step, run it with options fullstimer, and post the complete log. Also show the relevant part of the PROC CONTENTS output for the dataset at that point (number of observations and variables, observation size). Take care to redact sensitive information, if that is required.
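Something along these lines would produce the information asked for. The PROC SORT is only a stand-in for whatever your slow step is, and WORK.BIG and customer_id are placeholder names.

```sas
/* Sketch only: WORK.BIG and customer_id are hypothetical placeholders. */
options fullstimer;   /* log real time, CPU time and memory for each step */

/* The step under investigation (stand-in for your longest-running step) */
proc sort data=work.big out=work.big_sorted;
  by customer_id;
run;

/* Reports number of observations, number of variables, observation length */
proc contents data=work.big;
run;

options nofullstimer; /* switch the extended statistics off again */
```

Comparing real time with CPU time in the log is usually the quickest hint as to whether a step is waiting on I/O or is genuinely CPU-bound.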
We need the full log (use options fullstimer) to identify the longest steps and the limiting factor (disk/network, CPU, or memory) for each step. If the log is just too long, post the longest steps together with the steps immediately before and after each of them.
It's also very likely that some steps are redundant, can be written better, or can be re-organised or combined.