
Release Memory after large number of CALL EXECUTE's?

Frequent Contributor
Posts: 135


I'm using SAS to parse a zip archive on a UNIX/AIX machine.  Based on certain criteria, I select zero to many files from the zip archive to be unzipped.  Sometimes the number of files can be as high as 60,000.  I then format the file names of the selected files into SYSTASK commands to unzip the files in UNIX and CALL EXECUTE each SYSTASK.
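For readers unfamiliar with the pattern, the generation step described above might look roughly like this.  The dataset name `selected_files`, the variable `filepath`, and the archive path are assumptions for illustration, not the poster's actual code:

```sas
/* Sketch of the CALL EXECUTE pattern described above.               */
/* selected_files, filepath, and the archive path are assumed names. */
data _null_;
  set selected_files;
  length cmd $ 1000;
  cmd = catx(' ',
        'systask command',
        quote(catx(' ', 'unzip -o /data/archive.zip', strip(filepath))),
        cats('taskname=t', _n_),              /* unique task ID       */
        cats('status=rc', _n_),               /* per-task return code */
        ';');
  call execute(cmd);    /* queued to run after this data step ends   */
run;
```

Each generated SYSTASK runs NOWAIT by default, which is why a separate completion check (task IDs and return codes) is needed afterward.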

 

Everything works, and all the right files get unzipped in a timely manner, but any subsequent SAS steps run slower than molasses in January in Maine.  I run four SAS data steps after the unzips.  Normally the total run time for all four steps combined is less than five minutes.  However, if I run the unzips first and then the four SAS steps in the same SAS batch job, the four steps drag on and on, taking 10 minutes or more each.  If I break the job in two, then the four steps run in mere minutes total.

 

Presumably, I'm eating up a lot of memory with all my CALL EXECUTEs, and the lack of memory is causing the dramatic slowness.  Is there a way I can release the memory after I'm done with the CALL EXECUTEs?  Each of the SYSTASKs is submitted with a task ID and a task RC, so I know when the unzipping of the files is complete.
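Checking for completion with those task IDs would typically use WAITFOR.  A minimal sketch, assuming the task names `t1`, `t2`, … and status variables `rc1`, `rc2`, … assigned when the tasks were launched (in practice the list would be built with a macro loop for thousands of tasks):

```sas
/* Block until every outstanding SYSTASK finishes.            */
/* Task names below are assumed to match the launch step.     */
waitfor _all_ t1 t2 t3;

/* The STATUS= macro variables now hold each unzip's return code. */
%put NOTE: unzip return codes: &rc1 &rc2 &rc3;
```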

 

I could just break the job in two, but the whole purpose of the job is to not have someone sitting there manually checking to see if all the unzips of the files have completed.

 

Jim

Super User
Posts: 8,267

Re: Release Memory after large number of CALL EXECUTE's?

Posted in reply to jimbarbour

If you think that CALL EXECUTE() is the problem, then stop using it.  Just write the code to a file and %INCLUDE the file.

filename code temp;
data _null_;
  set zipfiles;
  file code;
  /* write one SYSTASK statement per file; fill in the rest of the command */
  put 'systask ' ...... ;
run;
%include code / source2;

But I suspect that using a lot of SYSTASK calls is more likely to be the cause.

 

Frequent Contributor
Posts: 135

Re: Release Memory after large number of CALL EXECUTE's?

Ah.  Good idea.  Thank you for that.  Worth a try at least, although it could be that, as you suspect, the SYSTASKs are what is killing memory.

 

VAX/VMS used to have an UNLOAD option for SAS. No such luck with it in UNIX (not a valid option based on the message I got), and OPTION CLEANUP doesn't seem to help any.

 

Is there no way to release memory between steps?

 

Super User
Posts: 4,017

Re: Release Memory after large number of CALL EXECUTE's?

Posted in reply to jimbarbour

Why don't you run your zip archive jobs in batch mode? Then it won't affect later EG session performance, and you can also run the batch job split into as many sub-jobs as you like.

Frequent Contributor
Posts: 135

Re: Release Memory after large number of CALL EXECUTE's?

@SASKiwi,

 

I may not be understanding your answer.  I can submit my SAS code either via UNIX at the command prompt or via SAS EG.  Either way, performance is very poor if the unzips are launched from the same SAS job in which I process the unzipped files.

 

I could split the job, but I was hoping to have a single job that could be launched and have no manual intervention -- which I do except that it's extremely slow on those steps after the unzips.

Super User
Posts: 4,017

Re: Release Memory after large number of CALL EXECUTE's?

Posted in reply to jimbarbour

@jimbarbour- What I was thinking is you could have several batch jobs, each running the same code but processing different batches of your zip archives and controlled by a job parameter to identify the zip batch to run.
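One hedged sketch of that idea: pass the batch number on the command line via `-sysparm` and have each batch job keep only its slice of the file list.  The dataset name `zipfiles` and the batch count are assumptions:

```sas
/* Invoked as, e.g.:  sas unzip_job.sas -sysparm 2                  */
%let batch    = &sysparm;   /* which slice this job should process  */
%let nbatches = 4;          /* total number of parallel batch jobs  */

data my_batch;
  set zipfiles;
  /* round-robin assignment of archive members to batches 1..N */
  if mod(_n_ - 1, &nbatches) + 1 = &batch;
run;
```

All N jobs run the same program; only the `-sysparm` value differs, so no manual editing is needed per batch.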

Frequent Contributor
Posts: 135

Re: Release Memory after large number of CALL EXECUTE's?

Posted in reply to jimbarbour
OK, so I think I've got a workable idea here:

After my unzips complete, instead of running additional job steps, I submit one more SYSTASK.  This SYSTASK initiates a SAS job -- a job that contains the aforementioned four data steps.  The four data steps execute in a separate memory space and will complete in a few minutes.

The main job will, in the meantime, have been in a wait state.  When the four data steps in the "subordinate" task complete, the main job resumes execution and simply writes the log file from the subordinate task into the log of the main job, so all log information ends up in a single location (in my case, typically Enterprise Guide).  At the end of the main job, I can send out an email notification or what have you -- all steps in all threads having been tracked by the main thread.

A bit of a kludge, but I am certain that it will work. It would be a lot easier if I could just clear the memory.
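A rough sketch of that hand-off, with hypothetical paths and names: the main job SYSTASKs a second SAS session, WAITFORs it, then streams the child's log into its own.

```sas
/* Launch the four data steps as a separate SAS session,          */
/* which gets its own memory space.  Paths are assumptions.       */
systask command
  "sas /jobs/post_unzip.sas -log /jobs/post_unzip.log"
  taskname=post status=postrc;

waitfor post;               /* main job idles until the child ends */

/* Fold the child's log into this session's log for one audit trail. */
data _null_;
  infile '/jobs/post_unzip.log';
  input;
  putlog _infile_;
run;
```

Because the child session exits when its steps finish, its memory is returned to the OS regardless of what the parent session has accumulated.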