BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Ahmedellaithy
Fluorite | Level 6

Dears,

I have data set named as have contain 3 variables and 1048579 rows and I am trying to run the following code but allows get the following errors :

ERROR: Hash data set load failed at line 85 column 4.
ERROR: Hash object added 0 items when memory failure occurred.
FATAL: Insufficient memory to execute DATA step program. Aborted during the EXECUTION phase.
ERROR: The SAS System stopped processing this step because of insufficient memory.
code:
proc sort data=have;
by acct_id date;
run;

data answers(drop=_: dpd);
set have;
by acct_id date;
retain FWD12mont_default_flg '0' default_Exclusion '0';

/* populate bqr1 to bqr12 and current and FWD default_flg */
if acct_id >= 1 then do;

dcl hash h1(dataset:cats('have (where=(acct_id="', acct_id, '"))'),
ordered:'y');
h1.defineKey('acct_id', 'date');
h1.defineData('dpd');
h1.defineDone();
declare hiter hh1 ('h1');
FWD12mont_default_flg='0';
end;
array _arr_bqr(*) $1 dpd1-dpd13;
call missing(of _arr_bqr(*));
_rc=hh1.setcur(key:acct_id, key:date);

do _i=1 to 13 while (_rc=0);
_arr_bqr[_i]=dpd;

if dpd='9' then
FWD12mont_default_flg='1';
_rc=hh1.next();
end;

if last.acct_id then
h1.delete();

/* populate bqr1 to bqr12 and current and BKW default_Exclusion */
if acct_id >= 1 then do;

dcl hash h11(dataset:cats('have (where=(acct_id="', acct_id, '"))'),
ordered:'descending');
h11.defineKey('acct_id', 'date');
h11.defineData('dpd');
h11.defineDone();
declare hiter hh11 ('h11');
default_Exclusion='0';
end;

/* populate bqr1 to bqr12 and default_flg */
array _arr_Hbqr(*) $1 BKW_dpd1-BKW_dpd13;
call missing(of _arr_Hbqr(*));
_rc=hh11.setcur(key:acct_id, key:date);

do _i=1 to 13 while (_rc=0);
_arr_Hbqr[_i]=dpd;
if dpd='9' then
default_Exclusion='1';
_rc=hh11.next();



end;




if last.acct_id then
h11.delete();
run;
 
 

  

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

If you DELETE the hash at last.acct_id, then you must DECLARE it at first.acct_id.

These actions have to be synchronized.

 

But from your code, I guess hash objects are not what you are looking for.

Use a double DO loop and a date-indexed array. In the first DO, read all observations into the array.

array dpds {'01jan1900'd:'31dec2099'd} temporary;
do i = '01jan1900'd tom'31dec2099'd;
  dpds{i} = .;
end;
do until (last.acct_id);
  set have;
  by acct_id;
  dpds{date} = dpd;
end;

Then, in the second DO loop, re-read the observations, set the flags, and OUTPUT.

do until (last.acct_id);
  set have;
  by acct_id;
  do i = intnx('month',date,-13) to date;
    if dpds{i} = 9 then flag1 = 1;
  end;
  output;
end;

View solution in original post

12 REPLIES 12
PeterClemmensen
Tourmaline | Level 20

The log explains what the problem is. You run out of memory, which gives an error and stops the data step. 

 

I suspect that the real question here is how to accomplish whatever you try to accomplish without the Data Step giving an error?

 

If that is the case, please provide a minimum of sample data and explain what the code above is suppose to do and the desired result. 

Ahmedellaithy
Fluorite | Level 6
 
 
data have;
input date:anydtdte. acct_id:$16. BQR:$1.;
format date date9.;
datalines;
30-Sep-17 1 0
30-Oct-17  1 0
29-Nov-17  1 0
29-Dec-17  1 0
28-Jan-18  1 0
27-Feb-18  1 0
29-Mar-18  1 0
28-Apr-18  1 0
28-May-18  1 0
27-Jun-18  1 0
27-Jul-18  1 0
26-Aug-18  1 0
25-Sep-18  1 0
25-Oct-18  1 0
24-Nov-18  1 9
24-Dec-18  1 0
23-Jan-19  1 0
22-Feb-19  1 0
24-Mar-19  1 0
23-Apr-19  1 0
23-May-19  1 0
22-Jun-19  1 0
22-Jul-19  1 0
21-Aug-19  1 0
20-Sep-19  1 0
20-Oct-19  1 0
19-Nov-19  1 0
19-Dec-19  1 0
18-Jan-20  1 0
17-Feb-20  1 0
18-Mar-20  1 0
17-Apr-20  1 0
17-May-20  1 0
16-Jun-20  1 0
16-Jul-20  1 0
15-Aug-20  1 0
14-Sep-20  1 0
14-Oct-20  1 0
13-Nov-20  1 0
;run;
the above is example of the data that I have and I need to look for 12 months forward and backward based on each observation date and in case the account dpd equal to 9 then to update two flags one in case this happed in the Forward looking and another in case this happed in the backward looking 
13-Dec-20  1 0
12-Jan-21  1 0
11-Feb-21  1 9
13-Mar-21  1 9
12-Apr-21  1 9
12-May-21  1 9
11-Jun-21  1 9
11-Jul-21  1 9
10-Aug-21  1 9
09-Sep-21  1 9
09-Oct-21  1 9
08-Nov-21  1 9
08-Dec-21  1 9
07-Jan-22  1 9
 
 
 
 
 
Ahmedellaithy
Fluorite | Level 6
I was trying to do that using hash option
Ahmedellaithy
Fluorite | Level 6
Is there is any way to optimize my code to overcome the memory issue noting I am using sas studio
mkeintz
PROC Star

You are probably creating a lot more hash objects than you delete.  In particular, you have:

 

  if acct_id >= 1 then do;
    dcl hash h1(dataset:cats('have (where=(acct_id="', acct_id, '"))'),ordered:'y');
      h1.defineKey('acct_id', 'date');
      h1.defineData('dpd');
      h1.defineDone();
    declare hiter hh1 ('h1');
    FWD12mont_default_flg='0';
  end;

and

  if acct_id >= 1 then do;
    dcl hash h11(dataset:cats('have (where=(acct_id="', acct_id, '"))'),ordered:'descending');
      h11.defineKey('acct_id', 'date');
      h11.defineData('dpd');
      h11.defineDone();
    declare hiter hh11 ('h11');
    default_Exclusion='0';
  end;

which creates an H1 and H11 hash object for every observation with an acct_id>1.  When you create a new hash object having the same name as an existing hash object, the memory used by the old object is not released, and its content is no longer accessible.  So you should delete before redeclaring.  Or better yet, just use the CLEAR method, then re-use without declaring a new identically named hash object.

 

Back to your error message" your removal of these hash objects from memory are done via

  if last.acct_id then h1.delete();

and

if last.acct_id then h11.delete();

So if you have multiple observations per acct_id (suggested by your use of set have; by acct_id date;), you are not doing proper housekeeping.    You are constantly increasing memory demands for hash objects to hold content that you can no longer access.  

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ahmedellaithy
Fluorite | Level 6

Hi Thank you for feedback but I am doing two hash using same info but one is sorted ascending and the other is descending.

 

could you let me know how to use the clear option that you mentioned .

Kurt_Bremser
Super User

You constantly re-declare the hash object(s), but delete them only when last.acct_id is true. This keeps the old objects in memory.

Instead of re-declaring the objects, declare them once at _n_ = 1, and use the CLEAR method to re-initialize them.

 

I suspect you should use

if first.acct_id

instead of

if acct_id >= 1

Since acct_id is character in your dataset, you should also use a character literal in the comparison, if you really want

if acct_id >= "1"
Ahmedellaithy
Fluorite | Level 6
if. First will not be correct as the account will be always flagged every month after first month account meet the flagging rule.
Kurt_Bremser
Super User

If you DELETE the hash at last.acct_id, then you must DECLARE it at first.acct_id.

These actions have to be synchronized.

 

But from your code, I guess hash objects are not what you are looking for.

Use a double DO loop and a date-indexed array. In the first DO, read all observations into the array.

array dpds {'01jan1900'd:'31dec2099'd} temporary;
do i = '01jan1900'd tom'31dec2099'd;
  dpds{i} = .;
end;
do until (last.acct_id);
  set have;
  by acct_id;
  dpds{date} = dpd;
end;

Then, in the second DO loop, re-read the observations, set the flags, and OUTPUT.

do until (last.acct_id);
  set have;
  by acct_id;
  do i = intnx('month',date,-13) to date;
    if dpds{i} = 9 then flag1 = 1;
  end;
  output;
end;
Tom
Super User Tom
Super User

Note that you need to give the ARRAY statement actual integer values for the bounds. And include the _'s in _temporary_.

1    data test;
2      array dpds {'01jan1900'd:'31dec2099'd} temporary;
                   ------------
                   22
                   200
ERROR 22-322: Syntax error, expecting one of the following: an integer constant, *.

ERROR 200-322: The symbol is not recognized and will be ignored.

3    run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST may be incomplete.  When this step was stopped there were 0 observations and 1 variables.
WARNING: Data set WORK.TEST was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


4    data test;
5      array dpds {%sysevalf('01jan1900'd) :%sysevalf('31dec2099'd)} temporary;
ERROR: Too few variables defined for the dimension(s) specified for the array dpds.
6    run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST may be incomplete.  When this step was stopped there were 0 observations and 0 variables.
WARNING: Data set WORK.TEST was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


7    data test;
8      array dpds {%sysevalf('01jan1900'd) :%sysevalf('31dec2099'd)} _temporary_;
9    run;

NOTE: The data set WORK.TEST has 1 observations and 0 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

 

mkeintz
PROC Star

I would recommend a minor tweak to @Tom 's suggestion for using date values as array bounds.  Instead of

7    data test;
8      array dpds {%sysevalf('01jan1900'd) :%sysevalf('31dec2099'd)} _temporary_;
9    run;

 

you could use double quotes instead of single quotes.  Unlike the single quotes, the double quotes will allow the macro interpreter to examine the quoted value.  If the quoted value is literal text as above, there is no difference (assuming you don't have nested quotes).  But if the quoted value is a macro variable or expression, you can use macro assignments that become usable in the array statement, as in

 

%let date1=01jan1900;
%let date2=31dec2099;

data test;
  array dpds {%sysevalf("&date1"d) :%sysevalf("&date2"d)} _temporary_;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 3756 views
  • 3 likes
  • 5 in conversation