I'm trying to join 17 tables together in SAS using PROC SQL and am running out of system resources. All tables have a simple index on the joined field. I'm thinking I can use the DATA STEP but I need to mimic COALESCE and left joins.
@tomcmacdonald wrote:
I'm trying to join 17 tables together in SAS using PROC SQL and am running out of system resources.
Are you joining all in single query or multiple queries?
Now I'm trying to run the code in UNIX batch mode and specifying -MEMSIZE MAX and -WORK <some directory> system options.
In my experience SAS SQL is not great when trying to join many tables in a single SQL. Depending on what you want to do and especially what volumes and relationships between the tables you're dealing with, some tweaks to the SQL or a SAS data step can reduce resource consumption and/or improve performance a lot.
In case this is one big table with a lot of lookups (i.e. a fact table with dimension and reference tables) then using a SAS data step with hash lookup tables can be much more efficient. Logically left joins and coalesce() can also get implemented in a SAS data step.
The problem here is the left joins, since SAS (and other) can optimize the as god as inner joins.
Usually when creating star schemas, dimension values that are missing can be assigned a dummy value, like a dimsnion key -1. Then you can switch to inner joins.
If your data is in Base SAS libraries consider moving to SPDE, which has better index management and some additional use cases for index use in the where planner.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.