DATA Step, Macro, Functions and more

Hash object with Insufficient memory to execute data step program

Reply
Frequent Contributor
Posts: 115

Hash object with Insufficient memory to execute data step program

Hi there,

 

In order to get the benefit of hash object programming for look up, I am trying to get a subset of a data having 3 million records based on the records with unique identifier present in another table with 1.3 million records. While doing so, getting an error message :

Insufficient memory to data step program. The SAS system stopped processing this step because of insufficient memory.

 

Can anybody suggest me to overcome the error or an alternative means to improve efficiency. 

 

Thank you in advance for your kind reply.

 

Regards,

 

 

Swain
PROC Star
Posts: 1,190

Re: Hash object with Insufficient memory to execute data step program

Posted in reply to DeepakSwain

Show us your code

Frequent Contributor
Posts: 115

Re: Hash object with Insufficient memory to execute data step program

data test1.table3 (drop=rc);
length Abstract $32767 Behavior $1 Comment $32767 Grade $1 Histology $4 Laterality $1 Site $4 TEST $20 ;

declare hash hh();
rc=hh.definekey('ID');
rc=hh.definedata('AbsrefID', 'Abstract', 'Behavior', 'Comment', 'DATEUPDATED', 'Grade', 'Histology', 'Laterality', 'ID', 'Site', 'TEST');
rc=hh.definedone();

do until (eof1);
set test1.table1 end=eof1;
rc = hh.add();
END;

DO UNTIL (eof2);
SET test2.table2 END=eof2;
rc = hh.find();
IF rc EQ 0 THEN OUTPUT;
end;
stop;
run;

LOG:

ERROR: Hash object added 1572848 items when memory failure occurred.
FATAL: Insufficient memory to execute DATA step program. Aborted during the EXECUTION phase.
ERROR: The SAS System stopped processing this step because of insufficient memory.
NOTE: There were 1572850 observations read from the data set test1.table1.
NOTE: There were 1 observations read from the data set test2.table2.
WARNING: The data set test1.table3 may be incomplete. When this step was stopped there were 0 observations and 27 variables.

Swain
Frequent Contributor
Posts: 109

Re: Hash object with Insufficient memory to execute data step program

Posted in reply to DeepakSwain
1. Please do not use the table with transaction data(heavier tables) as hash table.
2. The hash is called hash mapping. So mapping tables are used as hash tables.
3. Your program is crashing because, a certain amount of space is allocated for HASH tables, which is breached. We used SQL and indexes instead when we had these issues.
Trusted Advisor
Posts: 1,288

Re: Hash object with Insufficient memory to execute data step program

Posted in reply to DeepakSwain

You want to match records in table1 and table2 by id, and keep only those with id's in both tables, by using a hash object for table 1.  Your program assumes there is only 1 record per ID in table1.  It makes no assumption about table 2.

 

You ran into a memory problem storing table 1 into a hash object.  There are two solutions: (1) increase memory available to the program, or (2) reduce memory demanded by hash.

 

  1. Increase memory
    On my windows machine, if I run "proc options group=memory;run;" as suggested by others, it shows "memsize=2147483648"  (2 gigabytes).   You could start your sas session with more memory.  For instance, if I click on the Start button and type   
        "sas -memsize 4g"
    I have double the memory.

  2. Decreasing hash memory demands. 
    1. If table2 requires less memory than table1 and has only 1 observations per ID, then reverse the rolls of table 1 and table 2.  How do you determine the memory required?   Load a single record into the object (termed a single "data item" in hash lexicon), get the itemsize attribute   ("bytes_per_item=hh.itemsize;") and multiple by the number of unique ID's.

      By the way, I would not ADD one record at a time.  Instead use this declare statement to load it all at once:

      declare hash hh(dataset:"table1 (keep=abserfid abstract behavior comment dateupdated grade histology laterality id site test)");
        hh.definekey ('ID');
        hh.definedata(all:'Y');
        hh.definedone();
      
    2. Still too much memory needed?  You could then use the hash object merely as a pointer to a record id that could then be used in a "set .... POINT= " statement, as here (untested).
      data rid /view=rid;  /* Make pointers - record IDs*/
         set table1 (keep=id );
         rid=_n_;
      run;
      
      data want;
         if _n_=1 then declare hash hh (dataset:'rid');
            hh.definekey('id');
            hh.definedata(all:'Y');
            hh.definedone();
         end;
         set table2;
         rc=hh.find();
         if rc=0;
         set table1 (keep=id abstract behavior ......) point=rid;
      run;      
Respected Advisor
Posts: 4,557

Re: Hash object with Insufficient memory to execute data step program

Posted in reply to DeepakSwain

Your table TABLE1 contains 2 variables with 32KB length so no wonder you run out of memory.

 

What's in your table TABLE2? If this one is smaller then may-be load this one into the hash. And if the relationship between your tables is one-to-many then use the multidata option in the declare statement and the do_over() method when reading from the hash.

 

And just FYI: If you just want to load a table into a hash then you don't need to loop over the table but you can load it directly using code like: dcl hash hh(dataset:'test1.table1'); ....

 

 

Super Contributor
Posts: 500

Re: Hash object with Insufficient memory to execute data step program

Posted in reply to DeepakSwain

Impossible to suggest something without seeing the code and some knowledge about the data and setup of your sas environment. For the later please execute

proc options group=memory;
run;

and post the result using {i} icon.

Frequent Contributor
Posts: 115

Re: Hash object with Insufficient memory to execute data step program

Posted in reply to andreas_lds
proc options group=memory;
run;

LOG shows:

 

SAS (r) Proprietary Software Release 9.4 TS1M3


Group=MEMORY
SORTSIZE=1073741824
Specifies the amount of memory that is available to the SORT procedure.
SUMSIZE=0 Specifies a limit on the amount of memory that is available for data
summarization procedures when class variables are active.
MAXMEMQUERY=0 Specifies the maximum amount of memory that is allocated for procedures.
MEMBLKSZ=16777216 Specifies the memory block size for Windows memory-based libraries.
MEMMAXSZ=2147483648
Specifies the maximum amount of memory to allocate for using memory-based
libraries.
LOADMEMSIZE=0 Specifies a suggested amount of memory that is needed for executable
programs loaded by SAS.
MEMSIZE=107213322240
Specifies the limit on the amount of virtual memory that can be used during
a SAS session.
REALMEMSIZE=0 Specifies the amount of real memory SAS can expect to allocate.
NOTE: PROCEDURE OPTIONS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

 

Swain
Ask a Question
Discussion stats
  • 7 replies
  • 191 views
  • 4 likes
  • 6 in conversation