03-28-2018 12:45 PM
I'm trying to join 17 tables together in SAS using PROC SQL and am running out of system resources. All tables have a simple index on the joined field. I'm thinking I can use the DATA STEP but I need to mimic COALESCE and left joins.
03-28-2018 12:47 PM
I'm trying to join 17 tables together in SAS using PROC SQL and am running out of system resources.
Are you joining all in single query or multiple queries?
03-28-2018 03:21 PM - edited 03-28-2018 03:32 PM
Now I'm trying to run the code in UNIX batch mode and specifying -MEMSIZE MAX and -WORK <some directory> system options.
03-28-2018 03:50 PM
In my experience SAS SQL is not great when trying to join many tables in a single SQL. Depending on what you want to do and especially what volumes and relationships between the tables you're dealing with, some tweaks to the SQL or a SAS data step can reduce resource consumption and/or improve performance a lot.
In case this is one big table with a lot of lookups (i.e. a fact table with dimension and reference tables) then using a SAS data step with hash lookup tables can be much more efficient. Logically left joins and coalesce() can also get implemented in a SAS data step.
03-29-2018 04:29 AM
The problem here is the left joins, since SAS (and other) can optimize the as god as inner joins.
Usually when creating star schemas, dimension values that are missing can be assigned a dummy value, like a dimsnion key -1. Then you can switch to inner joins.
If your data is in Base SAS libraries consider moving to SPDE, which has better index management and some additional use cases for index use in the where planner.