texasmfp
Lapis Lazuli | Level 10

First, a big thanks to the SAS community, which recently helped get a program back up and running. I have run into two issues related to run time. Both involve the now-running program: a simple regression analysis that is looped over a varying number of iterations.

 

First, I am getting different run times using the same data and program. For example, run #1 takes 1 hour 44 minutes while run #2 takes 1 hour 21 minutes. There were no changes to the data or program, and no other applications were open; I simply ran it and then hit run again. I would expect repeatable times, so this is bizarre.

 

Second, and more important, the time per iteration was stable (actually slightly decreasing) until the number of iterations got large; at that point the time per iteration more than doubles:

 

Total seconds                4           58          627        4,866       83,734
seconds/iteration  0.114285714  0.097478992  0.095798319  0.092933537  0.257935139
iterations                  35          595        6,545       52,360      324,632

 

I found this note (http://support.sas.com/kb/57/630.html) and changed the dataset names so that they differ, which significantly improved the time per iteration across the board, but that pesky jump at 324K iterations remains.

 

Total seconds                4           47          513        4,521       49,066
seconds/iteration  0.114285714  0.078991597  0.078380443  0.086344538  0.151143449
iterations                  35          595        6,545       52,360      324,632

 

I think the ballooning lag relates to the .lck and .dat files being written as the program runs. The temporary .lck and intermediate data files are in the 160 MB range near the end of the iterations.

 

I also thought it could be printing notes or results, but I believe I turned off the log and results so that they don't waste time or clog things up. I would appreciate an explanation of why the run time per iteration is ballooning, or suggestions to make the code more efficient.
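
For reference, a minimal sketch of one way to silence the log and ODS output around a long loop. The original program's settings are not shown in this thread, so this is an assumption about what "turned off" might look like, not the poster's actual code:

```sas
/* Sketch only: quiet the log and ODS output while a long loop runs. */
options nonotes nosource nosource2 nomprint nomlogic nosymbolgen;

ods graphics off;   /* no ODS graphs from statistical procedures     */
ods exclude all;    /* send no output objects to any ODS destination */
ods noresults;      /* do not track output in the Results window     */

/* ... the iteration loop would run here (not shown in the thread) ... */

/* Restore normal behavior afterwards */
ods results;
ods exclude none;
options notes source;
```

PROC REG also accepts a NOPRINT option on the PROC statement, which suppresses its displayed output directly when only the output data sets are needed.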

1 ACCEPTED SOLUTION

SASKiwi
PROC Star

Looks like you are running this program interactively. Since it takes over an hour to run, you would be better off running it in batch mode. This will remove any timing variations caused by your SAS client interface.

 

Also, you don't say whether you are running this on a local PC or on a remote SAS server. Even on a PC you will get timing variations caused by whatever else you are using it for at the time, and if your SAS program is reading remote data across a network, that will cause further variation.
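
To compare interactive and batch runs on an equal footing, something like the following sketch could be used to time the job. The macro variable names t0 and elapsed are hypothetical, and the batch command in the comment uses placeholder file names:

```sas
/* Sketch only: measure total wall-clock time and per-step resources.  */
/* A batch run from the OS prompt typically looks like                  */
/*    sas -sysin myprog.sas -log myprog.log                             */
/* where myprog.sas and myprog.log are placeholder names.               */

options fullstimer;   /* log real time, CPU time, and memory per step */

%let t0 = %sysfunc(datetime());   /* start time (hypothetical name) */

/* ... call the looping macro here ... */

%let elapsed = %sysevalf(%sysfunc(datetime()) - &t0);
%put NOTE: Total elapsed seconds = &elapsed;
```

With FULLSTIMER on, a step whose real time is much larger than its CPU time is mostly waiting (on disk, the network, or other processes), which can help separate I/O effects from computation.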

4 REPLIES
mkeintz
PROC Star

I don't know that this would explain the unexpected run time per iteration, but my question is: why are you making single-use data set FILES instead of data set VIEWS? It appears that TARGETONLY, TARGET2ONLY, TARGET3ONLY, PMSADJ, and TARGETMEANS2 are each created only to append some data to &PMS_data. So why bother to write these data to disk only to re-read them and then discard them? It's quite possible that much of the time you are using goes to writing intermediate data sets out to disk. Using views instead will use more memory but could potentially save a lot of clock time.
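
A minimal sketch of the view idea, using sashelp.class as a stand-in because the original program is not shown here. The names regout and work.pms_demo are hypothetical; only TARGETONLY comes from the thread, and what it actually contains is an assumption:

```sas
/* Hypothetical stand-in for one iteration's regression output */
proc reg data=sashelp.class outest=regout noprint;
   model weight = height;
run;
quit;

/* A DATA step view: only the view definition is stored;  */
/* the rows are produced when the view is read.           */
data targetonly / view=targetonly;
   set regout(keep=_model_ intercept height);
run;

/* Reading the view here executes it and appends its rows. */
/* work.pms_demo is a placeholder for the &PMS_ data set.  */
proc append base=work.pms_demo data=targetonly force;
run;
```

The view definition itself is still saved as a small file in the WORK library, so some files will continue to appear there even though the intermediate rows are never written out.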

 

On a more strategic note, would it be possible to avoid iterating, and just create bigger datasets with BY groups corresponding to your macro iterations?
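
A rough sketch of what the BY-group approach could look like, again using sashelp.class as stand-in data. How the real program maps each changelist string to observations in REGDATA1 is not shown, so the combo_map table (one row per iteration and observation) and every name below are assumptions:

```sas
/* Hypothetical map: which observations belong to which iteration. */
/* Toy rule: iteration 1 uses every student, iteration 2 drops     */
/* every third one.                                                */
data combo_map;
   set sashelp.class(keep=name);
   iter = 1; output;
   if mod(_n_, 3) ne 0 then do;
      iter = 2; output;
   end;
run;

/* Stack the selected observations for all iterations in one table */
proc sql;
   create table stacked as
   select m.iter, c.*
   from combo_map as m
        inner join sashelp.class as c
        on m.name = c.name
   order by m.iter;
quit;

/* One PROC REG call fits a separate model for each BY group */
proc reg data=stacked outest=all_est noprint;
   by iter;
   model weight = height;
run;
quit;
```

Here all_est ends up with one row of estimates per value of iter, so no per-iteration PROC APPEND is needed. For several hundred thousand iterations the stacked table could become very large, so processing the iterations in batches of BY groups may be more practical than one single pass.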

 

texasmfp
Lapis Lazuli | Level 10

Mkeintz:

 

I tested the VIEWs and saw no improvement; in fact, they made it slightly slower for the larger set of iterations. I watched the temporary SAS folder, and those VIEW tables are written to the drive just as the .lck and .sas files are. So it's not surprising that the run times did not change.

 

However, all of those files have only a single observation, so I don't think they are the problem. I suspect that the main issue is the &PMS_ data file, which is appended to with each iteration.

 

proc append FORCE base=&PMS_ data=TARGETMEANS2; 
run; 

The &PMS_ file starts small but reaches the 150 MB range by the 300,000-iteration level. That's just my guess. If so, is there a way to avoid writing out that data file with each iteration?

 

On the BY group suggestion, I am desperate for a faster solution, but I am not sure what you mean by bigger datasets. The iterations come from the 'changelist' value in the testcomb2 dataset. The program takes a changelist value (a long string), uses that string to select a subset of observations in REGDATA1, runs the regression, and aggregates the results in the &PMS_ file (after making the regression results jump through a few hoops). Then it repeats for the next changelist value in the combos2 dataset. How would I structure that series of steps as you suggested? Thanks.

texasmfp
Lapis Lazuli | Level 10
Thanks. Batch mode definitely shaved run times, especially on longer runs.
