I'm fairly new to SAS and am working with a statistician to develop a model using PROC MCMC. The model runs extremely slowly, so I took a look at her code to see whether I could help optimize it, since I have experience with Stan and pyMC3. The code has quite a few DO loops, e.g.,
do i = 1 to 3000; result[i] = exp(a[1] + a[2] * log2(x[i])); end;
In both Stan and pyMC3, many functions are vectorized. That is, a function takes a whole vector or matrix as its argument, so you don't have to operate element by element. For example, in Stan, we could replace the DO loop with the following code:
result = exp(a[1] + a[2] * log2(x));
Avoiding loops and using the vectorized forms of `exp` and `log2` in Stan greatly speeds up the program. Is this also true for SAS? If so, how would one vectorize the above code? From the documentation, it looks like log2 and exp take only scalar arguments (i.e., single floating-point values) and cannot work with arrays.
You can vectorize these functions using PROC IML. Whether or not that helps with PROC MCMC, I don't know.
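As a minimal sketch of what the vectorized form might look like in PROC IML: the values of x and a below are made-up placeholders, not taken from the original model, and this says nothing about whether the same approach carries over to PROC MCMC.

```sas
proc iml;
   /* Hypothetical inputs for illustration only */
   x = {1, 2, 4, 8};      /* column vector of predictor values */
   a = {0.5, 0.25};       /* a[1] = intercept, a[2] = slope */

   /* log2 and exp are applied to every element of x at once; */
   /* no DO loop is needed                                     */
   result = exp(a[1] + a[2] * log2(x));

   print result;
quit;
```

Here result has the same dimensions as x, and the scalars a[1] and a[2] are combined with every element of the vector.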
One question might be why there are 3000 separate variables in the first place.
How long does "extremely slow" actually take, and how many observations and variables are in the data set?
@smantha wrote:
Do over x;
Result = exp(a[1]+a[2]*x);
End;
DO OVER is no longer supported by SAS 9.4, as far as I can see in the documentation. It may still work, but there could possibly be situations where it doesn't work properly.
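For comparison, here is a sketch of the same calculation written with an explicitly indexed array and DIM(), which is the documented form. It is a plain DATA step, not PROC MCMC code, and the array sizes and values are made up for illustration.

```sas
data _null_;
   /* Hypothetical inputs for illustration only */
   array x[4] (1 2 4 8);     /* creates x1-x4 with these initial values */
   array a[2] (0.5 0.25);    /* a[1] = intercept, a[2] = slope */
   array result[4];          /* creates result1-result4 */

   /* Explicit index; dim(x) avoids hard-coding the array length */
   do i = 1 to dim(x);
      result[i] = exp(a[1] + a[2] * log2(x[i]));
   end;

   put result1-result4;      /* write the four results to the log */
run;
```

This is still an elementwise loop, so it addresses the DO OVER concern rather than the speed question.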
Calling @Rick_SAS