Vectorizing SAS code (exp, log2, max)

StanPyMC3User · Posted 07-23-2020 12:36 PM

I'm fairly new to SAS and am working with a statistician to develop a model using PROC MCMC. It's running extremely slowly, so I took a look at her code to see if I could help optimize it. The code has quite a few do loops, e.g.,

do i = 1 to 3000;
    result[i] = exp(a[1] + a[2] * log2(x[i]));
end;

My main experience is with Stan and pyMC3. In both of those packages many functions are vectorized. That is, the function takes a vector or matrix rather than having to operate elementwise. For example, in Stan, we could replace the do loop to get the following code:

result = exp(a[1] + a[2] * log2(x));

Avoiding loops and using the vectorized forms of `exp` and `log2` in Stan greatly speeds up the program. Is this also true for SAS? If so, how would one vectorize the above code? From the documentation, it looks like log2 and exp take only scalar arguments (i.e., single floating-point values) and cannot work with arrays.

PaigeMiller · Posted 07-23-2020 12:45 PM

You can vectorize these functions using PROC IML. Whether or not that helps with PROC MCMC, I don't know.
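
For illustration, here is a minimal sketch of what the vectorized expression might look like in SAS/IML, which applies functions such as EXP and LOG2 elementwise to a matrix. The values of a and x are made up for the example, and this runs as a standalone PROC IML step; it does not show how (or whether) the same form can be used inside PROC MCMC.

```sas
proc iml;
    /* Hypothetical values, chosen only for illustration */
    a = {0.5, 0.25};          /* coefficient vector: a[1], a[2] */
    x = {1, 2, 4, 8};         /* stand-in for the 3000-element x */

    /* EXP and LOG2 are applied to every element of x; no DO loop needed */
    result = exp(a[1] + a[2] * log2(x));

    print result;
quit;
```

Here result has the same dimensions as x, with one value per element.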

--
Paige Miller

Reeza · Posted 07-23-2020 12:49 PM

3000 variables? That's a lot of variables to be working with so I'm not surprised at all it's slow.
How many rows? If you want suggestions for efficiency and can post the whole code I'm sure you'll get suggestions as well.

I don't think a do loop is the inefficiency here. SAS typically processes data line by line, but I'm not 100% sure how it would do it within MCMC. You could also explore forcing the data into memory instead, which may speed things up.

SASfile statement
https://documentation.sas.com/?docsetId=lestmtsglobal&docsetTarget=n0osyhi338pfaan1plin9ioilduk.htm&...
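
As a rough sketch of the SASFILE idea, the statement loads a data set into memory before the steps that read it and releases it afterwards. The data set name work.sim and the PROC MEANS steps are placeholders for this example; in practice the step in between would be the PROC MCMC call, and whether it benefits depends on how much time is actually spent reading the data.

```sas
/* Hypothetical data set, created only so the example is self-contained */
data work.sim;
    do i = 1 to 1000;
        x = i;
        output;
    end;
run;

/* Hold the data set in memory for the steps that follow */
sasfile work.sim load;

/* Any procedures that read work.sim now read it from memory */
proc means data=work.sim mean;
    var x;
run;

proc means data=work.sim max;
    var x;
run;

/* Release the memory when finished */
sasfile work.sim close;
```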

ballardw · Posted 07-23-2020 01:03 PM

One question might be why there are 3000 separate variables in the first place.

How long is "extremely slow" taking and how many observations and variables are in the data set?

smantha · Posted 07-24-2020 01:34 AM

Can you try using a do over loop rather than a simple do loop?

smantha · Posted 07-24-2020 01:37 AM

Do over x;
Result = exp(a[1]+a[2]*x);
End;

PaigeMiller · Posted 07-24-2020 06:43 AM

@smantha wrote:
Do over x;
Result = exp(a[1]+a[2]*x);
End;

DO OVER is no longer supported by SAS 9.4, as far as I can see in the documentation. It may still work, but there could possibly be situations where it doesn't work properly.
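
For comparison, a minimal sketch of the documented alternative: an explicitly subscripted array with an indexed DO loop in a DATA step. The data set names (have, want), the variable names (x1-x3, r1-r3), and the coefficient values are assumptions made up for the example, not taken from the original code.

```sas
/* Hypothetical input data: three x variables per row */
data have;
    input x1 x2 x3;
    datalines;
1 2 4
8 16 32
;
run;

data want;
    set have;
    array x[3] x1-x3;                     /* explicit array over the inputs */
    array result[3] r1-r3;                /* one output variable per input */
    array a[2] _temporary_ (0.5 0.25);    /* assumed coefficients */
    do i = 1 to dim(x);
        result[i] = exp(a[1] + a[2] * log2(x[i]));
    end;
    drop i;
run;
```

Using dim(x) as the upper bound means the loop does not need to change if the array size does.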

--
Paige Miller

Ksharp · Posted 07-24-2020 07:51 AM

Calling @Rick_SAS
