First, get to know your data. If there are lots of long character variables, the physical dataset size can become overwhelming. Even if COMPRESS is used, sort utility files will be uncompressed and take a long time to write and read.
If your "child datasets" are in fact lookup tables, these can be handled much better by hash objects than by individual joins. I have often replaced multiple SORT/MERGE or SQL JOIN steps with a single data step that uses multiple hashes in a sequential read of the large dataset, resulting in a BIG performance gain.
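As a rough illustration of the hash approach, here is a minimal sketch that replaces two lookup joins with one sequential pass over the large dataset. All dataset and variable names (work.big, work.cust_lkp, work.prod_lkp, cust_id, cust_name, prod_id, prod_desc) are hypothetical placeholders, and it assumes both lookup tables fit in memory and have unique keys.

```sas
/* Hypothetical names throughout: work.big is the large dataset,
   work.cust_lkp and work.prod_lkp are small lookup tables. */
data work.enriched;
  /* Put the lookup variables into the PDV without reading any rows */
  if 0 then set work.cust_lkp work.prod_lkp;

  /* Load both lookup tables into memory once, on the first iteration */
  if _n_ = 1 then do;
    declare hash cust(dataset: 'work.cust_lkp');
    cust.defineKey('cust_id');
    cust.defineData('cust_name');
    cust.defineDone();

    declare hash prod(dataset: 'work.prod_lkp');
    prod.defineKey('prod_id');
    prod.defineData('prod_desc');
    prod.defineDone();
  end;

  /* Single sequential read of the large dataset; no sorting needed */
  set work.big;

  /* find() returns 0 on a match. On a miss, clear the lookup value so
     that the value from the previous observation does not carry over. */
  if cust.find() ne 0 then call missing(cust_name);
  if prod.find() ne 0 then call missing(prod_desc);
run;
```

This behaves like a left join: every observation of the large dataset is kept, and unmatched keys simply get missing lookup values.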
Locate the longest-running step, run it with options fullstimer, and post the complete log. Also show the relevant part of the PROC CONTENTS output for the dataset at that point (number of observations and variables, observation size). Take care to redact sensitive information, if that is required.
We need the full log (use options fullstimer) to identify the longest steps and the limiting factor (disk/network, CPU, or memory) for each step. If the log is just too long, post the longest steps together with the steps immediately before and after them.
It's also very likely that some steps are redundant or could be written better, or that steps could be reorganised or combined.
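Collecting that information could look roughly like the sketch below. The PROC SORT step only stands in for whatever your slow step actually is, and work.big and cust_id are hypothetical names.

```sas
/* Write detailed timing and memory statistics to the log for each step */
options fullstimer;

/* Placeholder for the slow step; substitute your own code here */
proc sort data=work.big out=work.big_sorted;
  by cust_id;
run;

/* Reports number of observations, number of variables, and
   observation length for the input dataset */
proc contents data=work.big;
run;

/* Switch back to the shorter default statistics */
options nofullstimer;
```

In the log, comparing real time with CPU time for the step gives a first hint as to whether it is waiting on I/O or is limited by CPU.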