I have written a program that runs a Monte Carlo simulation, and it takes too long to run. I think I could use the multithreading feature of SAS (MP CONNECT) to make the code more efficient, but I am not sure how to do it. I am still reading about it and do not find it straightforward. The code below is a similar but simplified version of the problem I am struggling with. I would appreciate your help with (i) advice on whether my feeling is right and (ii) how to do it efficiently (including references to relevant documentation, etc.).
proc iml;
start program;
   * read the response and the regressors;
   use Grunfeld.grunfeld;
   read all var {inv} into Y;
   read all var {v,k} into x1;
   niter = 1000;
   ones = j(nrow(x1),1,1);
   X = ones||x1;
   * run OLS;
   bols = solve(X`*X,X`*Y);
   * Restrictions: R1 selects the coefficient on v and R2 the coefficient on k;
   R1 = j(1,3,0);
   R1[2] = 1;
   R2 = j(1,3,0);
   R2[3] = 1;
   seed = 1000;
   * function for computing the test statistic g;
   start statistic (g, decision, bhat, R, b, b0, x, y,seed);
      decision = 0;
      * simulate y from the fitted model with standard normal errors;
      y = x*b + normal(j(nrow(x),1,seed));
      bhat = solve(x`*x, x`*y);
      vcovbhat = inv((x`*x));
      * generalized inverse of M computed from its SVD;
      start InvMat(InvM, M);
         call SVD(Ustar, qstar, Vstar, M);
         InvM = Vstar*ginv(diag(qstar))*Ustar`;
      finish InvMat;
      v = R*vcovbhat*R`;
      run InvMat(invv, v);
      g = (R*bhat-b0)`*invv*(R*bhat-b0);
      * 5 percent critical value of the chi-square distribution with 1 df;
      asympt_crit_value = 3.841455338;
      if g > asympt_crit_value then decision = 1;
   finish statistic;
   * Monte carlo simulation p-value;
   start MC(RR, g, niter, decision, bhat, R, bols, b0, X, Y,seed);
      Rej = j(niter,1,0);
      do i = 1 to niter;
         run statistic(g, decision, bhat, R, bols, b0, X, Y,seed);
         Rej[i] = decision;
      end;
      * rejection frequency over the niter replications;
      RR = Rej[:,];
   finish MC;
   run MC(RR1, g1, niter, decision, bhat, R1, bols, bols[2], X, Y,seed);
   run MC(RR2, g2, niter, decision, bhat, R2, bols, bols[3], X, Y,seed);
   print bols;
   print R1;
   print R2;
   print niter RR1 RR2;
finish;
run program;
quit;
Thank you,
Mantobaye
MP CONNECT is not multi-threaded by definition; it is more like multi-process.
I don't know Monte Carlo or IML.
But to use MP CONNECT, you need to be able to split your program into steps that can run in parallel.
So if you are able to split your IML code into several IML calls (I have no idea how to do that), then it's worth a try. There are plenty of examples in the documentation to help you get started.
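As a rough illustration of what "steps that can run in parallel" could look like here, the two RUN MC calls in the posted program do not depend on each other, so each could in principle go to its own SAS session. This is only a sketch: it assumes SAS/CONNECT is licensed, that extra sessions can be started on the same machine with SASCMD=, and that the Grunfeld libref is already assigned in the parent session. The task names (task1, task2), the pwork libref, and the output data sets rr1 and rr2 are made up for the example, and the IML bodies are placeholders.

```sas
/* Sketch only: assumes SAS/CONNECT and local sessions started via SASCMD=. */
options autosignon sascmd="!sascmd";

/* Task 1: Monte Carlo for restriction R1, in its own SAS session. */
rsubmit task1 wait=no inheritlib=(Grunfeld work=pwork);
   proc iml;
      /* ... module definitions and RUN MC(...) for R1 would go here ... */
      RR1 = 0;   /* placeholder for the real result */
      create pwork.rr1 var {RR1};   /* write the result to the parent WORK */
      append;
      close pwork.rr1;
   quit;
endrsubmit;

/* Task 2: Monte Carlo for restriction R2, started without waiting for task 1. */
rsubmit task2 wait=no inheritlib=(Grunfeld work=pwork);
   proc iml;
      /* ... module definitions and RUN MC(...) for R2 would go here ... */
      RR2 = 0;   /* placeholder for the real result */
      create pwork.rr2 var {RR2};
      append;
      close pwork.rr2;
   quit;
endrsubmit;

/* Block until both sessions have finished, then end them. */
waitfor _all_ task1 task2;
signoff _all_;
```

WAIT=NO is what makes the two blocks run side by side; without it the parent session waits for each RSUBMIT to finish before starting the next one.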
If you can't split up your PROC IML, you may need to look for ways to make it run more efficiently by itself.
OPTIONS FULLSTIMER; will tell you a little about the resources consumed. Do you run this on a SAS server? Perhaps you could get more resources by increasing MEMSIZE?
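For example, something like the following could be run before the simulation to see what the session has to work with. MEMSIZE is a start-up option, so it is set on the command line or in the configuration file rather than in an OPTIONS statement; the 4G value and the program name in the comment are only illustrations.

```sas
/* Write real time, CPU time and memory use for every step to the log. */
options fullstimer;

/* Show the current memory limit, the CPU count SAS sees, and the threading setting. */
proc options option=(memsize cpucount threads);
run;

/* MEMSIZE can only be set when SAS starts, for example:
   sas -memsize 4G myprogram.sas   (illustrative values) */
```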
How many are too many?
It's up to you how many pieces you create.
If your box has 8 cores, you may want to create 4-8 sessions, depending on how PROC IML itself uses threads.
Perhaps you need to develop some kind of parameterized macro logic to split the work into a reasonable number of subprocesses in MP CONNECT.
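A minimal sketch of such macro logic, under the same assumptions as any MP CONNECT setup (SAS/CONNECT licensed, sessions started locally with SASCMD=). The macro name run_chunks, its parameters, the rej1, rej2, ... data sets and the "base seed plus session number" rule are all invented for the example, and the IML body is a placeholder. Because the RSUBMIT block sits inside a macro definition, the macro references in it are expected to resolve in the parent session before the code is sent; that behaviour is worth checking in your own environment.

```sas
%macro run_chunks(nsess=4, niter=1000, baseseed=1000);
   %local i chunk;
   /* Iterations per session, rounded up. */
   %let chunk = %sysevalf(&niter / &nsess, ceil);

   %do i = 1 %to &nsess;
      /* Start one session per chunk and submit its work without waiting. */
      signon task&i sascmd="!sascmd";
      rsubmit task&i wait=no inheritlib=(work=pwork);
         proc iml;
            niter = &chunk;
            seed  = %eval(&baseseed + &i);   /* assumed rule: one seed per session */
            /* ... the simulation for this chunk would go here ... */
            rejects = 0;                     /* placeholder: number of rejections */
            create pwork.rej&i var {niter rejects};
            append;
            close pwork.rej&i;
         quit;
      endrsubmit;
   %end;

   /* Wait for every session, then close them all. */
   waitfor _all_ %do i = 1 %to &nsess; task&i %end; ;
   signoff _all_;

   /* Stack the per-session results and compute the overall rejection rate. */
   data rej_all;
      set rej1-rej&nsess;
   run;

   proc sql;
      select sum(rejects) / sum(niter) as reject_rate
      from rej_all;
   quit;
%mend run_chunks;

%run_chunks(nsess=4, niter=1000, baseseed=1000);
```

Changing nsess is then the only thing needed to try 4, 6 or 8 sessions and compare the FULLSTIMER figures in the log.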
LinusH,
In the serial version, I use 3 different data sets, each subdivided into 5 sub-data sets of different sizes. These are used for Monte Carlo simulation and bootstrap experiments. I have tried to split the code into 3 pieces for one particular data set, but they still run serially. I would also like to avoid running the common parts many times, if possible, to make better use of resources.