Hi I am using SAS Viya 4,
I am getting this error when using proc sql with multiple joins in sas viya4.0
ERROR: The SAS System stopped processing this step because of insufficient memory.
Problem: I have 6 tables join in the proc sql, of which 1 table i am using libref to connect to oracle and rest of 5 tables i have loaded to CASLIB from oracle. So the join is happening with inmemory CAS tables and oracle table, since here in this case i cannot use fedsql, i am using proc sql and then i am getting this error.
Sample code:
libname test oracle path="********" schema="*****" authdomain="Oracle.*****" encoding='UTF8';
options casdatalimit=ALL;
PROC SQL ;
CREATE TABLE output1 AS
SELECT * FROM
test.table1 t1, /* this is a full load oracle table with ~80gb size*/
caslib.table2 t3,
caslib.table2 t4,
caslib.table2 t5,
caslib.table2 t2,
caslib.table2 t6
Kindly help.
Thanks
Sai Krishna
There are some ways in which you can make your query more efficient:
This looks like a multiple look-up, for which SQL is the worst tool in the SAS toolbox.
Your use of the asterisk also points to careless coding, which will at least lead to multiple WARNINGs about variables already present in the SELECT.
So if your caslib tables are unique with regard to their keys, and can fit into memory of the Base SAS or Viya process, a single DATA step with multiple hash objects will be the answer.
How many observations do the caslib tables have, and which variables (type, size) do they contain?
the select * is replaced with below query. Also caslib tables are unique .
Sample code:
libname test oracle path="********" schema="*****" authdomain="Oracle.*****" encoding='UTF8';
options casdatalimit=ALL;
PROC SQL ;
CREATE TABLE output1 AS
SELECT
t1.column1,
t2.column1,
t2.column2,
t2.column3,
t2.column4,
t2.column5,
t2.column6,
t2.column7,
t2.column8,
t2.column9,
t2.column11 AS column1name,
t2.column12,
t2.column13,
t2.column14,
t2.column15,
t3.column1,
t2.column1,
/**/
(SUM(t1.column1)) AS columnname
FROM
test.table1 t1, /* this is a full load oracle table with ~80gb size*/
caslib.table2 t3,
caslib.table2 t4,
caslib.table2 t5,
caslib.table2 t2,
caslib.table2 t6
quit;
Hello @saikrishna786
The way SQL execute it follows the following order
1.From 3.Where 3.Group By 4.Having 5.Select 6.order By
Thus from pulls all data till where and join takes place. try rewriting your query may be creating intermediate datasets depending upon resources available to you.
A better way is to implement this query in oracle itself (SQL pass through).
In case you are not passing anything as a filter condition, it would be worthwhile to have your required table exist as a view in oracle.
Do remember server resources are shared by many users concurrently and are not infinite.
You are joining an Oracle table with CASLIB tables so SAS is having to read a complete copy of the Oracle table down to SAS before it can do the joining. This is likely contributing to your memory issues. Since you are only reading five columns from this table in the join, bring down only these columns first into a CASLIB table as a separate query first.
You run out of memory because of the way you are doing
SELECT * FROM test.table1 t1, /* this is a full load oracle table with ~80gb size*/ caslib.table2 t3, caslib.table2 t4, caslib.table2 t5, caslib.table2 t2, caslib.table2 t6
That is called a Cartesian join. That means you take every record from table1 and combine every record in table2 (if table2 has 4 records then you are using 80gb*4 just for the table1 data now plus the size of table2) then take that minimum of 320gb(or whatever) and match every one of those records with every one in table2 (again????? which if still just 4 records is not 1280gb) , then the same multiple times.
Repeat. Any way, AFTER you fill up the hard drive with all of those combinations then the WHERE attempts to filter them down.
I really hope you were trying to hide your actual table names (silly, if your table names are that sensitive you shouldn't be asking any external help) and made some typos repeating Table2.
Otherwise
T3.column4 is also the same as T2.column4, T4.column4, T5.column4 andT6.column4 (and the same for every other column) as they are all the same caslib.table2
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.