Many SAS procedures have an NTHREADS option that lets you fine-tune how computing resources are allocated, and it can have a dramatic impact on processing times. As far as I know, GLIMMIX is not one of them.
I am running bootstrap simulations in which the same GLIMMIX model is fitted 500 times. At approximately 4 minutes per bootstrap replication, the total processing time comes to about 2,000 minutes, or well over one day.
Because resource utilization was minimal during the whole run, I split the 500 bootstrap samples into 10 parallel runs of 50 replications each. To my surprise, with exactly the same GLIMMIX model running in 10 parallel processes, the per-replication processing time increased from 4 minutes to about 18 minutes. Resource utilization came nowhere near the limits of the powerful Windows machine I am using (64-core CPU, 128 GB of RAM, NVMe drives).
In summary, parallelizing the process cut the total run time roughly in half, which is nowhere near the 10-fold reduction I expected from 10 parallel runs.
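To put numbers on that, here is a minimal Python sketch of the arithmetic. It uses only the figures quoted above (500 replications, 4 minutes each in a single run, 18 minutes each with 10 parallel runs); the variable names are my own, and it assumes all 10 runs start together and finish at about the same time.

```python
# Figures taken from the post; the script measures nothing itself.
total_reps = 500
serial_min_per_rep = 4       # minutes per replication, single run
parallel_jobs = 10
parallel_min_per_rep = 18    # minutes per replication with 10 runs at once

# Single run: every replication is processed one after another.
serial_total = total_reps * serial_min_per_rep

# Parallel runs: wall-clock time is the time of one job (assumed equal for all jobs).
reps_per_job = total_reps // parallel_jobs
parallel_total = reps_per_job * parallel_min_per_rep

speedup = serial_total / parallel_total
efficiency = speedup / parallel_jobs  # 1.0 would be perfect scaling

print(f"single run:    {serial_total} min ({serial_total / 60:.1f} h)")
print(f"10 parallel:   {parallel_total} min ({parallel_total / 60:.1f} h)")
print(f"speedup:       {speedup:.2f}x")
print(f"efficiency:    {efficiency:.0%}")
```

So the 10 parallel runs behave like a little over two fully effective processes rather than ten.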
Can anyone help me understand why this is?
More importantly, how do I optimize the workflow? That is, would 50 parallel runs be faster than 10, or would 5 be better? I can, of course, just try, but the process is very time-consuming.
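One way to avoid a full run at every setting might be to time just a few replications at each level of parallelism and extrapolate. The Python sketch below shows that bookkeeping. The function name and the `measured` dictionary are hypothetical, it assumes the per-replication time stays roughly constant for the length of a run, and the only timings filled in are the two I actually have (1 run and 10 runs).

```python
import math

def estimated_wall_minutes(total_reps, jobs, min_per_rep):
    """Estimate wall-clock minutes when `jobs` processes run side by side
    and each replication takes `min_per_rep` minutes at that concurrency."""
    reps_per_job = math.ceil(total_reps / jobs)
    return reps_per_job * min_per_rep

# Per-replication minutes observed at each number of parallel runs.
# Only 1 and 10 come from the post; entries for 5, 50, etc. would have
# to be added after a short pilot. No values are guessed here.
measured = {1: 4.0, 10: 18.0}

estimates = {
    jobs: estimated_wall_minutes(500, jobs, minutes)
    for jobs, minutes in measured.items()
}
best_jobs = min(estimates, key=estimates.get)

for jobs, minutes in sorted(estimates.items()):
    print(f"{jobs:>3} parallel runs: about {minutes:.0f} min")
print("fastest setting measured so far:", best_jobs)
```

A pilot of two or three replications at 5, 20 and 50 parallel runs would cost far less than a full run and would fill in the rest of the dictionary.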
Thanks for your thoughts.