jklein_271
Calcite | Level 5

I'm running a relatively simple macro that essentially uses the SAS XML libname engine to consume tens of thousands of XML files and insert them into a few datasets.  The code itself is working fine with no errors.  The problem I'm having is that after a full load (~75k macro iterations) completes, SAS doesn't appear to want to let go of a large chunk of memory.  I've looked over my code numerous times and I've literally disconnected, deleted, and cleared everything I can think of.  From what I can tell, SAS is holding onto a small amount of memory after each loop, an amount that isn't even noticeable until a few thousand iterations have accumulated.  If I close the session, the memory is released and all is well.  Obviously, I'd like to find a way to prevent this accumulation so that it isn't slowly eating almost all of my available memory by the time the load finishes and I can close the session.
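For context, each iteration presumably boils down to something like this minimal sketch (the path, the feed libref/fileref, and the perm.orders table are placeholder names, not the actual code):

filename feed "c:\xml_in\file_00001.xml";
libname feed xml;          /* XML libname engine picks up the file from the matching FEED fileref */

proc append base=perm.orders data=feed.orders force;   /* insert into a permanent "base" table */
run;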


What I've tried:

  • Clear the libname connecting to the XML file after each loop
  • Clear the filename associated with the XML file after each loop
  • Use the X command to delete the XML file after each loop (a sketch of this cleanup follows the list)
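A minimal sketch of that per-iteration cleanup, assuming the libref and fileref are both named feed and XCMD is enabled (the path is a placeholder):

libname feed clear;
filename feed clear;
x 'del "c:\xml_in\file_00001.xml"';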

The theory I then had was that maybe I was blowing up the macro variable symbol table, since I do submit a CALL SYMPUT within each iteration.  However, it's always called the same thing, so I know I'm overwriting the same macro variable.  In addition, it's a local macro variable that is killed after my macro completes.  I'm basically out of ideas, so if anyone can think of where SAS could be hiding ~11.9 GB of memory allocation while sitting idle, it would be greatly appreciated.  I've attached a small picture showing the sas.exe memory usage overlaid on top of my cleared task status pane.


[Attachment: sas_exe_mem.jpg]
14 REPLIES
ballardw
Super User

I don't know if this is related, but SAS will keep (or at least try to keep) some recently used datasets in memory.

The system option MEMSIZE may be involved.

Ksharp
Super User

Can you assign a single libname for all of these XML files? Like:

libname x xml ( 'c:\a.xml' 'c:\b.xml'  ....) ;

or

libname x xml 'c:\a.xml' ;

libname x xml 'c:\b.xml' ;

........

Ksharp

Geraldo
Fluorite | Level 6

Hello.

If this is on a PC, it is probably all being handled in your PC's memory and RAM is being swapped, as happens when using the SAS Add-in for Microsoft Office.

At some point your PC's memory can't take any more...

I am Brazilian.

Geraldo

FriedEgg
SAS Employee

As Ksharp points out, it would probably be a good idea to concatenate some of the XML files into a single library and avoid a few of those iterations if possible.

A thought I had, which may be far off, is that you are filling up your memory with log information in the buffer.  Check your LOGPARM option (see below); you may also want to run the program as a batch job to avoid the overhead of the editor holding what could be a large amount of log and output information.  With 75k iterations through an unknown amount of code that is presumably running multiple data steps, you are talking about a huge amount of log data.
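To check the current value (LOGPARM can only be set at SAS startup, in the config file or on the invocation command line):

proc options option=logparm;
run;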

jklein_271
Calcite | Level 5

Thanks for all of the replies.  Hopefully I can respond to each and maybe get somewhere here.  I'm about to run this process again.

ballardw:

I think I'm good as far as config settings go.  I have 16 GB on the machine and set the following after reading a number of postings and articles on SAS mem config settings.  If anyone has any suggestions on better ways to tweak this, please let me know.  I'm all ears.  I don't think this is my problem, but I'm willing to try anything.

-MEMSIZE 16G

-REALMEMSIZE 12G

-SORTSIZE 4G

Ksharp:

I'm not sure if a single libname for multiple files will help me in this case.  As each xml file contains large amounts of data across many datasets, I'm consuming each file and inserting into "base" tables on each loop.  In my mind, this was really the only way to do it.  I also thought this approach would have the least amount of overhead.

Geraldo:

I don't think this is a swap issue.  I've monitored the memory pretty closely and it doesn't appear to ever spill over into swap.

FriedEgg:

The log is an interesting theory.  I'll have to give that some thought.  However, I will say that I am outputting the log and output to an external file.  The resulting log file is over 6 GB and I've been using UltraEdit to open it.  Since I'm outputting the log and output outside of EG, I can't imagine it would be storing anything internal to the app.
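For what it's worth, if that redirection is being done in code rather than through EG settings, the usual pattern is a PROC PRINTTO along these lines (a sketch of one way to do it; the paths are placeholders):

proc printto log="c:\logs\xml_load.log" print="c:\logs\xml_load.lst" new;
run;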

Doc_Duke
Rhodochrosite | Level 12

EG, as in Enterprise Guide?  I believe that it keeps a copy of the log file as part of its internal documentation.  You should be able to tell if you save the project and the .egp file size balloons up.

Another possibility is in the macro itself.  If you have any macro variables, they may be hanging around in memory, especially if the macro repetitively calls other macros.

Doc Muhlbaier

Duke

jklein_271
Calcite | Level 5

Thanks for replying, Doc.

Yes, Enterprise Guide.  For the record, I'm running EG 4.3 (I need to download 5 from the depot) and Base 9.3.  I don't believe logging is the issue.  The .egp was 65 KB before execution and 75 KB after saving once the macro finished.  I've made sure that project logging is turned off, and the log tab has basically one page of log output since it then outputs to a .log file.  You do bring up a good point, though, as far as EG overhead in general.  I will run this tomorrow just in Base and see if I see any difference in memory.

I also keep coming back to macros, but I can't find any evidence of them holding onto memory.  To summarize quickly, the macro (as far as macro variables and looping are concerned) does the following (a rough sketch follows the list):

  • outer loop with an iteration for each xml file (let's say ~60k files)
    • file_name mac var created using data _null_ / point / call symput feeding off a dataset of all the XML file names
    • XML libname submitted for XML file using filename mac var
  • inner loop with an iteration for each dataset in the XML library (let's say ~20 per XML file)
    • table_name mac var created using data _null_ / point / call symput feeding off of dictionary.tables for the XML library
    • proc datasets append each dataset to a permanent dataset
  • clear libname for the XML library
  • clear filename for the XML file
  • X command delete XML file
  • clean up all datasets in WORK
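A rough sketch of that structure, just to make the shape concrete.  Everything named here is a placeholder (the driver table perm.xml_file_list and its xml_path column, the permanent library perm, the feed libref/fileref); it is meant to show the shape of the loop, not reproduce the actual code:

%macro load_xml_files;
  %local i j nfiles ntabs file_name table_name;

  /* how many XML files are queued up in the driver table */
  proc sql noprint;
    select count(*) into :nfiles trimmed from perm.xml_file_list;
  quit;

  %do i = 1 %to &nfiles;                        /* outer loop: one pass per XML file */

    data _null_;                                /* same FILE_NAME mac var overwritten each pass */
      ptr = &i;
      set perm.xml_file_list point=ptr;
      call symputx('file_name', xml_path, 'L');
      stop;
    run;

    filename feed "&file_name";
    libname  feed xml;                          /* XML libname engine reads the FEED fileref */

    /* list the datasets surfaced by this XML library */
    proc sql noprint;
      create table work._members as
        select memname from dictionary.tables where libname = 'FEED';
      select count(*) into :ntabs trimmed from work._members;
    quit;

    %do j = 1 %to &ntabs;                       /* inner loop: one pass per dataset */
      data _null_;                              /* same TABLE_NAME mac var overwritten each pass */
        ptr = &j;
        set work._members point=ptr;
        call symputx('table_name', memname, 'L');
        stop;
      run;

      proc append base=perm.&table_name data=feed.&table_name force;
      run;
    %end;

    libname  feed clear;                        /* release the XML library */
    filename feed clear;                        /* release the fileref */
    x "del ""&file_name""";                     /* delete the consumed file (needs XCMD) */

    proc datasets library=work nolist;          /* clean up WORK for the next pass */
      delete _members;
    quit;
  %end;
%mend load_xml_files;

%load_xml_files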

I could maybe understand memory growing incrementally if I created a different macro variable for each iteration (file_name1-60000 and table_name1-20), but even then I would expect the memory to drop off after the macro completed, as these local macro variables would die after macro execution.  That being said, I'm "overwriting" both macro variables at the beginning of each respective iteration.  The memory growing to 12 GB and holding after completion just doesn't add up to me.  I've checked %put _all_ and the GUI presentation of SASHELP.VMACRO, and I'm not seeing anything in there I wouldn't expect to see (a quick check is sketched below).  As far as user-defined macros go, only the 3 or so I've submitted at the top of my code (before the macro) remain.
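One diagnostic sketch for confirming the symbol table isn't quietly filling up is to count macro variables by scope from the dictionary tables:

proc sql;
  select scope, count(*) as n_macro_vars
  from dictionary.macros
  group by scope;
quit;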

I have the EG project up, sitting idle, and with the performance monitor up showing the 12 GB holding.  If anyone has any ideas, I'm open to trying anything.

FriedEgg
SAS Employee

Since you are submitting the code through EG to a local server, try running the program directly as a batch job instead, such as:

C:\path\to\sas.exe -batch -logparm "write=immediate rollover=100M" -sysin "C:\path\to\pgm.sas" -log "C:\path\for\log.log" -lst "C:\path\for\list.lst"

* include -xcmd if necessary

I cannot remember off the top of my head how to alter the system options for a local server on EG4.3, or I would suggest how to make the logparm adjustment there.  Does EG read the same configuration files as it normally would when using a local server connection?

With a log file 6 GB in size, I suspect you will find this helps reduce the memory usage; the problem really seems to fit, to me.  You may also try reducing the amount of logging information you are creating by removing things like MPRINT, SYMBOLGEN, MLOGIC, etc.
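For example, switching those off for the run is just:

options nomprint nosymbolgen nomlogic;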

FriedEgg
SAS Employee

I had another thought.  It has been a while since I have used SAS on Windows, but I recall it having the ability to define the WORK directory as an in-memory library by use of an option.  If you are collecting all of your data in the WORK directory and this option is enabled, it could cause the issue you are seeing.  It would explain why memory grows over time as the additional files are appended and why it holds the memory even after the program completes executing.  The last piece is the confounding piece; I do not think the log would hold the memory after the buffer has cleared when running in batch mode, or through EG.

http://support.sas.com/documentation/cdl/en/hostwin/63285/HTML/default/viewer.htm#win-sysop-memlib.h...

proc options option=memlib; run;

MEMLIB is also available as a LIBNAME option, so check there as well.
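On Windows, a library assigned with that option would look something like this (the path is a placeholder):

libname mylib "c:\myproject\data" memlib;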

SASKiwi
PROC Star

As well as running the job in batch mode, I suggest you use the SAS option FULLSTIMER.  What this will do is print memory usage stats into your log for each step, which may help you find where your memory balloons out.  The option will slow your job down, so remember to remove it after testing.
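Turning it on and off is just a pair of OPTIONS statements:

options fullstimer;      /* per-step memory and timing stats in the log */
/* ... steps to profile ... */
options nofullstimer;    /* remove the overhead after testing */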

jklein_271
Calcite | Level 5

Again, thanks for all of the comments.

FriedEgg:

Executing in batch is definitely a good suggestion.  I'll actually just try executing in Base before I go all the way to batch.  I don't think Base is going to be any better than EG, as the process holding onto all of the memory is sas.exe and not seguide.exe.  The LOGPARM suggestion is a good one.  When just using EG as an extra layer on top of Base via a local server, it feeds off of the same cfg file as Base 9.3 on session start.  One exception here is that I don't think the LOGPARM system option applies to EG, but I will set it to immediate if it isn't already before I submit the loop in Base.  As far as reducing the log file size, I always have NOSYMBOLGEN and NOMPRINT set.  I will also set NOMLOGIC.  It'll certainly shed some weight, but I'm sure it'll still be a hefty log file.

The WORK library was one of the first things I targeted.  Even though I don't retain anything in WORK (I delete the one dataset resulting from my loop at the end of each loop), I ran a PROC DATASETS KILL on WORK anyway when I first noticed this issue.  I was honestly so confused as to how it took a session close to release the memory that I tried to recreate as much of a session close as I could without actually exiting the program.  Needless to say, the PROC DATASETS KILL deleted nothing (as nothing was there) and the memory didn't move an inch.  As an FYI, MEMLIB is already set as NOMEMLIB.
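For reference, the KILL step I ran was along these lines (a sketch, not the exact code):

proc datasets library=work kill nolist;
quit;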

SASKiwi:

I always have FULLSTIMER set and was looking at the memory allocation in the log early on.  Unfortunately, all I can see is a very slow, incremental memory increase as I page down the log through iterations.  I would love it if it just jumped from 3 GB to 12 GB at a specific point, as that would make this MUCH easier to isolate.

Cynthia_sas
SAS Super FREQ

Hi:

  At this point, I'd recommend that you open a track with Tech Support. I really doubt that macro variables are the culprit here. Possibly there is something that is grabbing memory and not letting it go; Tech Support would be the ones best qualified to help you figure out what is causing the issue.

cynthia

  To open a track with Tech Support, fill out the form at this link:

http://support.sas.com/ctx/supportform/createForm

art297
Opal | Level 21

I agree with Cynthia that it is (over)time to submit your problem to Tech Support.  I've seen similar complaints that repeated macro variable assignments appear to eat up memory, even when they are overwriting the same macro variable, but no one (that I'm aware of) has posted data and code that replicates the problem.

If that is indeed what is happening, we would all like to know and know how to correct the situation.

Please let us know what Tech Support has to say.

jklein_271
Calcite | Level 5

Just an update on this old thread.  I ran the multi-day process again using the same code, but with SAS 9.4, and I did not experience the same problem.  It appears the "memory leak" I was experiencing is no longer an issue.  Memory was flat through the entire process and was at an expected idle level after the process completed.


Discussion stats
  • 14 replies
  • 2023 views
  • 0 likes
  • 9 in conversation