Haris
Lapis Lazuli | Level 10

Many SAS procedures have an NTHREADS option that lets you fine-tune computer resource allocation and can have a dramatic impact on processing times.  As far as I know, GLIMMIX is not one of them.

 

For a bootstrap simulation, I am running the same GLIMMIX model 500 times. At approximately 4 minutes per bootstrap replication, total processing time comes to about 2,000 minutes, or well over one day.

 

Given that computer resource utilization is minimal during the whole run, I tried splitting the 500 bootstrap samples into 10 parallel runs of 50 replications each.  To my surprise, when running exactly the same GLIMMIX model in 10 parallel processes, the per-replication processing time increased from 4 minutes to about 18 minutes.  Resource utilization never came remotely close to the limits of the powerful Windows machine I am using (64-core CPU, 128 GB of RAM, NVMe drives).

 

In summary, parallelizing the process cut the total run time roughly in half, but that is nowhere near the 10-fold reduction I expected from 10 parallel runs of the model.
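To put numbers on that, here is a minimal sketch of the arithmetic in Python, using only the figures quoted above (500 replications, 4 minutes each in a single run, 18 minutes each across 10 parallel runs). The function name `parallel_summary` is just an illustrative choice.

```python
import math

def parallel_summary(total_reps, serial_min_per_rep, n_jobs, parallel_min_per_rep):
    """Return serial minutes, parallel wall-clock minutes, speedup, and efficiency."""
    serial_total = total_reps * serial_min_per_rep
    # Each job takes an equal share of the replications; wall time is set by one share.
    reps_per_job = math.ceil(total_reps / n_jobs)
    parallel_total = reps_per_job * parallel_min_per_rep
    speedup = serial_total / parallel_total
    # Efficiency of 1.0 would mean a perfect n_jobs-fold speedup.
    efficiency = speedup / n_jobs
    return serial_total, parallel_total, speedup, efficiency

serial_total, parallel_total, speedup, efficiency = parallel_summary(500, 4, 10, 18)
print(f"serial: {serial_total} min, parallel: {parallel_total} min")
print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```

So the 10 parallel runs take about 900 minutes instead of 2,000, a speedup of a little over 2x where 10x would be ideal.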

 

Can anyone help me understand why this is?

 

More importantly, how do I optimize the workflow?  That is, would 50 parallel runs be faster than 10, or would 5 be better?  I can, of course, just try each setup, but the process is very time-consuming.
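One way to keep the experimenting cheap might be to time only a few replications at each level of parallelism and extrapolate. The sketch below (Python, with a hypothetical helper `estimate_wall_minutes`) assumes the per-replication time stays constant over a run. Only the two timings I have actually measured are filled in; entries for 5, 20, or 50 jobs would have to come from short pilot runs.

```python
import math

def estimate_wall_minutes(total_reps, minutes_per_rep_by_jobs):
    """Map each tested job count to its estimated total wall-clock minutes."""
    estimates = {}
    for n_jobs, minutes_per_rep in minutes_per_rep_by_jobs.items():
        # Replications are split evenly, so one job's share sets the wall time.
        reps_per_job = math.ceil(total_reps / n_jobs)
        estimates[n_jobs] = reps_per_job * minutes_per_rep
    return estimates

# Measured so far: 1 job at 4 min/replication, 10 jobs at 18 min/replication.
# Other job counts are unknown until a pilot run supplies their timings.
pilot = {1: 4.0, 10: 18.0}
estimates = estimate_wall_minutes(500, pilot)
best_jobs = min(estimates, key=estimates.get)
print(estimates, "-> fastest tested setup:", best_jobs, "jobs")
```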

 

Thanks for your thoughts.

JOL
SAS Employee

Try re-posting under Analytics -> Statistical Procedures

SASKiwi
PROC Star

Compare the real time and CPU time notes that GLIMMIX writes to your SAS log. If real time is much greater than CPU time, you can be pretty much certain that your jobs are I/O-bound. I suspect that a resource-intensive procedure like GLIMMIX does a lot of utility-file processing in WORK. Monitoring activity in your WORK folders could confirm this.
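If you have many logs to check, the comparison can be scripted. Below is a rough Python sketch that pulls the real and CPU times out of log text. It assumes the default timing notes (lines beginning with "real time" and "cpu time" after a "NOTE: PROCEDURE ... used" line); with FULLSTIMER the CPU lines are labeled differently and the pattern would need adjusting. The function names and the sample log values are made up for illustration, not taken from your runs.

```python
import re

STEP_RE = re.compile(r"NOTE: (PROCEDURE \w+|DATA statement) used")
TIME_RE = re.compile(r"^\s*(real time|cpu time)\s+(\S+)")

def to_seconds(text):
    """Convert '0.03', '4:02.15', or '1:02:03.5' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def step_timings(log_text):
    """Return one dict per step with its name and any timings found."""
    steps = []
    current = None
    for line in log_text.splitlines():
        step = STEP_RE.search(line)
        if step:
            current = {"step": step.group(1)}
            steps.append(current)
            continue
        timing = TIME_RE.match(line)
        if timing and current is not None:
            current[timing.group(1)] = to_seconds(timing.group(2))
    return steps

# Illustrative log excerpt only; the numbers are invented for the example.
sample_log = """\
NOTE: PROCEDURE GLIMMIX used (Total process time):
      real time           18:05.40
      cpu time            3:58.10
"""

for s in step_timings(sample_log):
    ratio = s["real time"] / s["cpu time"]
    print(f'{s["step"]}: real/cpu ratio = {ratio:.1f}')
```

A ratio near 1 suggests the step is CPU-bound; a ratio well above 1 means the step spent most of its elapsed time waiting on something other than the CPU.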

Haris
Lapis Lazuli | Level 10

Thanks for the recommendation regarding WORK.  I am looking at Windows Resource Monitor and don't see anything of note for CPU, RAM, or disk I/O: maybe 10-15% utilization at most.  Anything else you can recommend?

Rick_SAS
SAS Super FREQ

When you say that you are running "10 parallel processes," how are you actually doing that? Can you post the code you are submitting? For example, are you submitting 10 SAS jobs from a Linux command line, or doing something else?

Haris
Lapis Lazuli | Level 10
Sorry, I should have been more specific: I am running SAS on a Windows 11 desktop. The ten programs running the GLIMMIX models are submitted by right-clicking each one and choosing "Batch Submit with SAS 9.4". The GLIMMIX code is probably not relevant, but here it is just in case:

proc glimmix data=BootData Method=Quad;
   where SampleID LE 50;
   by SampleID;
   freq NumberHits;
   class Stratum; * pre-sorted ;
   class ageGrp / reference = last ;
   model OpMort (event="1 = Yes") = &ModelVars / dist=binary link=logit cl;
   random intercept / subject= Stratum;
   output out=out.B0 pred(iLink BLUP)=PredProb_boot
                     pred(noBLUP)=PredProb_boot_noBLUP_xBeta
   ;
run;
