DATA Step, Macro, Functions and more

ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

Reply
Super Contributor
Posts: 673

ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

proc sort data=table1 out=_a nodupkey; by acctnbr; run;
proc sort data=table2 (keep=acctnbr r&questpin r&questmatch) out=_b nodupkey; by acctnbr; run;
data table3;

    merge _a(in=in_a)
      _b(in=in_b);
by acctnbr;
if in_a;
if in_b then quest='Y'; else quest='N';
run;

1. The above two datasets are huge,they have over 100MM records. and I get the above error message. This is run on Unix. Is there any turn around to fix the issue>?

2. How can this be done in SQL, to avoid sorting and merging,

Help is appreciated,

Super User
Posts: 7,762

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

How do you think that SQL would accomplish this task? It can't ask $DEITY for the result, so it resorts to more or less the same technique (sorting and merging), but very often less efficient than most code that you write.

Your problem is not enough disk space (or user quota) in the location of your work directory.

Do you run out of space during sorting or during the merge?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super Contributor
Posts: 298

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

There are several ways to merge the data sets without sorting one or both the data sets. The size (100 M) seems to be large. Give some example data sets and the desired WANT data set

then suitable method(s) can be shown.

Super Contributor
Posts: 673

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

Well, I want all the columns from table1 and the columns from table2 (shown in keep options).

The only help needed is how to translate this datastep if conditions to sql.

if in_a;

if in_b then tag_found_quest='Y'; else tag_found_quest='N';

Super User
Posts: 7,762

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

This would have to look like:

proc sql;

create table table3 as

select

  a.*,

  b.r&questpin,

  b.r&questmatch,

  case

    when b.acctnbr is missing

    then 'N'

    else 'Y'

  end as tag_found_quest

from table1 a

left join table2 b

on a.acctnbr = b.acctnbr

;

quit;

But be aware that, for this to work, you need enough space to hold

- table1

- table2

- sizeof (table1+table2) (roughly) for the SQL utility file

- table3

in your work location, which will most probably lead to another disk full condition.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,942

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

Maybe also try piecemeal'ing your data, i.e. reduce the data you are working on to only one site or account block at a time and do lots of smaller parts. 

Trusted Advisor
Posts: 3,212

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

SQL will is as fast in generating this kind of errors, may be even faster as it not tuned to be using resources carefully. Going to an external DBMS you are also getting similar ones. Often there is a limit, very low one, on size of temporary tables as of preventing users to use too much.

Well you are running out of resources. There are two options to solve that.

- program your code in a smarter way to avoid that usage

- Add additional resources.

By the way 100M records is telling nothing it could be that small as just 1 or 2 Gbyte. A better measure aside the number of records is the total size in bytes /  the size of each record.   

---->-- ja karman --<-----
SAS Employee
Posts: 17

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

You can alleviate the disk space condition by using another storage device and separating the data sets from the utility files by either placing the input/output data sets in permanent storage (define and specify a library) instead of WORK and/or specifying a separate location for the utility files using the -utilloc option (this presumes that the multi-threaded SAS sort is being invoked by SORT and SQL - you can check by setting the MSGLEVEL to i (e.g., options msglevel=iSmiley Wink.  Either strategy will likely help performance as well.

Super Contributor
Posts: 298

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

I do not have SQL head. Do you accept a data step solution which require no sorting?

Respected Advisor
Posts: 4,173

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

If disk space for WORK is the issue then you could use one or several of below approaches:

- Before the sorting delete all tables in WORK which are no more required

- Don't create a new table but replace the existing table with a sorted version "proc sort data=table1 out=table1 nodupkey; by acctnbr; run"

- In case  table2 with the required variables (keep=acctnbr r&questpin r&questmatch) fits into memory: Don't sort your tables at all but load table 2 into a hash and then use a hash look-up instead of a merge.

- Write the sorted tables (out=) and table3 to a permanent data set using a libref which points to a file system with sufficient space.

- Make sure that UTILLOC points to another disk than WORK

You could use code as below. That would avoid creating additional tables in work. You would still need sufficient space in UTILLOC during sorting.

proc sort data=table1 nodupkey; by acctnbr; run;
proc sort data=table2 (keep=acctnbr r&questpin r&questmatch) nodupkey; by acctnbr; run;
data table1;

    merge table1(in=in_a)
      table2(in=in_b);
by acctnbr;
if in_a;
if in_b then quest='Y'; else quest='N';
run;

Super User
Posts: 7,762

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

data table1;

    merge table1(in=in_a)
      table2(in=in_b);
by acctnbr;
if in_a;
if in_b then quest='Y'; else quest='N';
run;

This will not prevent the file full condition, since table1.sas7bdat and table1.sas7bdat.lck will exist concurrently until the data step finishes execution.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Trusted Advisor
Posts: 3,212

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

Except that the OP's is focused in believing SQL is solving his big data issue, we did not get any additional information on sizing release the logical question behind that join.

---->-- ja karman --<-----
Trusted Advisor
Posts: 3,212

Re: ERROR: Insufficient space in file. is damaged. I/O processing did not complete.

At the moment just for background knowledge.

- When the condition is just deleting some records marked by a condition and not adding new variables.

- Additional storage resources are not easy to get. (really big data)

- You have backup/restore and checkpoint restart processes reviewed on needed and time.

You can do an update in place.

SAS(R) 9.4 Statements: Reference, Third Edition (modify statement), SAS(R) 9.4 Statements: Reference, Third Edition (remove statement).

This is the same approach a RDBMS would do in an OLTP approach. Marking records as being deleted, not replacing all datablocks. This is what SQL was designed for. 

If that is what you think is needed as "SQL" you can code in that way. The common SQL usage by analists is not done like that but based on copying complete tables.   

---->-- ja karman --<-----
Ask a Question
Discussion stats
  • 12 replies
  • 14180 views
  • 0 likes
  • 7 in conversation