Solved: ERROR: Insufficient space in file WORK

parmis · Posted 01-23-2018 03:33 PM

I'm running a very simple inner join query in SAS, and it gives me the "insufficient space in work.SASTMP" error. My WORK directory is empty, and I'm not even using it. All my SAS data sets are on my C: drive, which has more than 200 GB of space.

Can anyone help me with that? Again, my libraries are on my C: drive.

Thanks

Kurt_Bremser · Posted 01-23-2018 03:42 PM

Even a slight mistake can cause SQL to run out of space, so we need to see the code. It's also important to know the data, so a quick overview (number of observations & variables, observation size) of the datasets will be helpful. That you do not use WORK does not mean much, as proc sql builds its utility file(s) there, and that seems to be the problem. Is WORK also located on the C: drive?
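A minimal sketch of how that information could be collected, assuming the CPI libref and the table name CPI.CUST_profile_info that come up further down in this thread (adjust to your own names). Whether PROC SQL honors UTILLOC for all of its utility files is not something this thread confirms, so treat that part as a starting point for checking, not a guarantee:

```sas
/* Print the physical path of the WORK library to the log */
%put NOTE: WORK is at %sysfunc(pathname(work));

/* Show the WORK and UTILLOC system options; UTILLOC names the
   location for utility files and defaults to the WORK location */
proc options option=(work utilloc);
run;

/* Number of observations, variable list and observation length
   of one of the input tables */
proc contents data=CPI.CUST_profile_info;
run;
```

The path printed by the first statement tells you which drive actually has to hold the temporary files.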

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

parmis · Posted 01-24-2018 08:06 AM

I'm trying to submit the following query. I have also tried other queries and still get the same error. Yes, WORK is located on my C: drive, but I have plenty of space there (more than 200 GB). One of my tables has 900,000 observations and is 1.2 GB. The other one has 600,000 observations and is 412 MB. The tables include personal information about customers.

proc sql outobs=10;
create table CPI.combined as
  select distinct
    a.ACCTNUM,
    a.CUST_ID,
    b.SourceACCT,
    b.CUST_ID
  from CPI.CUST_profile_info a,
       CPI.CUST_DEC_PRO b
  where a.CUST_ID ^= b.CUST_ID;
quit;

Kurt_Bremser · Posted 01-24-2018 08:18 AM

No wonder you run out of space:

proc sql outobs=10;
create table CPI.combined as
  select distinct
    a.ACCTNUM,
    a.CUST_ID,
    b.SourceACCT,
    b.CUST_ID
  from CPI.CUST_profile_info a,
       CPI.CUST_DEC_PRO b
  where a.CUST_ID ^= b.CUST_ID;
quit;

This means that every observation in cust_profile_info is joined with ALL observations of cust_dec_pro that have a different cust_id.

If the cust_id values are unique within each dataset, every observation in cust_profile_info pairs with all but at most one observation in cust_dec_pro, so you end up with roughly 900,000 * 599,999 = 539,999,100,000 observations.
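To see the effect on a small scale, here is a sketch with two made-up toy tables (work.a and work.b are hypothetical names, not the real data). It only counts the pairs that the inequality join has to build:

```sas
/* Hypothetical toy data: 5 and 4 observations with unique cust_id values */
data work.a;
  do cust_id = 1 to 5;
    output;
  end;
run;

data work.b;
  do cust_id = 1 to 4;
    output;
  end;
run;

proc sql;
  /* Same join condition as in the original query, reduced to a row count */
  select count(*) as n_pairs
    from work.a as a, work.b as b
    where a.cust_id ^= b.cust_id;
quit;
```

With 5 and 4 observations the join already yields 16 pairs (5 * 4 minus the 4 matching ids). Also keep in mind that OUTOBS=10 only limits how many rows are written to the output table; with SELECT DISTINCT the full join result still has to be built and de-duplicated first, which is presumably where the utility file space goes.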

Kurt_Bremser · Posted 01-24-2018 08:25 AM

PS do you want to find matches or non-matches on a join by cust_id?

parmis · Posted 01-24-2018 08:49 AM

Yes, this is exactly what I'm trying to do. I was using an inner join first, then I changed it to a WHERE clause.

I have also tried to submit the following query just to see if it works, but I still get the same error message after an hour.

proc sql outobs=10;
create table CPI.test as
  select distinct
    a.CUST_ID,
    a.ACCTNUM
  from CPI.CUST_INFO a;
quit;

Kurt_Bremser · Posted 01-24-2018 08:57 AM

SQL is notoriously bad when it has to do sorting on big tables.

Use this instead:

proc sort
  data=cpi.cust_info (keep=cust_id acctnum)
  out=cpi.test
  nodupkey
;
by cust_id acctnum;
run;

If you want to discover cust_id's in one dataset that are not present in the other, use a data step merge.

If you need to find matching cust_id's with differences in other variables, it can also be achieved in a data step merge.
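A minimal sketch of such a merge for the non-match case, assuming the two tables named earlier in the thread and assuming cust_id has the same type and length in both. The output names work.prof, work.dec, work.only_in_profile and work.only_in_dec are made up for illustration:

```sas
/* A data step merge needs both inputs sorted by the key;
   keep only the key and drop duplicate cust_id values */
proc sort data=CPI.CUST_profile_info (keep=cust_id)
          out=work.prof nodupkey;
  by cust_id;
run;

proc sort data=CPI.CUST_DEC_PRO (keep=cust_id)
          out=work.dec nodupkey;
  by cust_id;
run;

data work.only_in_profile work.only_in_dec;
  /* The IN= flags tell which input contributed to the current cust_id */
  merge work.prof (in=in_prof)
        work.dec  (in=in_dec);
  by cust_id;
  if in_prof and not in_dec then output work.only_in_profile;
  else if in_dec and not in_prof then output work.only_in_dec;
run;
```

For the second case (matching cust_id values with differences elsewhere), you would keep the variables to compare, rename them in one of the inputs so they do not overwrite each other, and output only when both IN= flags are set.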

parmis · Posted 01-24-2018 12:26 PM

Thank you so much, it works.

You're the best 🙂

Reeza · Posted 01-23-2018 03:48 PM

Are you using SAS UE?

parmis · Posted 01-24-2018 08:07 AM

No, I'm using SAS PC
