I have written a program that runs a Monte Carlo simulation, and it takes too long to run. I feel like I could use the multithreading feature of SAS (MP CONNECT) to make the code more efficient, but I am not sure how to do it. I am still reading about it and do not find it straightforward. The code below is a similar but simplified version of the problem I am struggling with. I would appreciate your help with (i) advice on whether my feeling is right and (ii) how to do it efficiently (including references to the relevant documentation, etc.).
proc iml;
start program;
   use Grunfeld.grunfeld;
   read all var {inv} into Y;
   read all var {v,k} into x1;
   niter = 1000;
   ones = j(nrow(x1),1,1);
   X = ones||x1;
   *run OLS;
   bols = solve(X`*X,X`*Y);
   * Restrictions;
   R1 = j(1,3,0);
   R1[2] = 1;
   R2 = j(1,3,0);
   R2[3] = 1;
   seed = 1000;
   * function for computing the test statistic g;
   start statistic (g, decision, bhat, R, b, b0, x, y,seed);
      decision = 0;
      * simulate y under the null and re-estimate;
      y = x*b + normal(j(nrow(x),1,seed));
      bhat = solve(x`*x, x`*y);
      vcovbhat = inv((x`*x));
      * generalized inverse via SVD: M = U*diag(q)*V`, so the inverse uses U transposed;
      start InvMat(InvM, M);
         call SVD(Ustar, qstar, Vstar, M);
         InvM = Vstar*ginv(diag(qstar))*Ustar`;
      finish InvMat;
      v = R*vcovbhat*R`;
      run InvMat(invv, v);
      g = (R*bhat-b0)`*invv*(R*bhat-b0);
      * 5% critical value of the chi-square distribution with 1 df;
      asympt_crit_value = 3.841455338;
      if g > asympt_crit_value then decision = 1;
   finish statistic;
   * Monte carlo simulation p-value;
   start MC(RR, g, niter, decision, bhat, R, bols, b0, X, Y,seed);
      Rej = j(niter,1,0);
      do i = 1 to niter;
         run statistic(g, decision, bhat, R, bols, b0, X, Y,seed);
         Rej[i] = decision;
      end;
      * rejection frequency over all iterations;
      RR = Rej[:,];
   finish MC;
   run MC(RR1, g1, niter, decision, bhat, R1, bols, bols[2], X, Y,seed);
   run MC(RR2, g2, niter, decision, bhat, R2, bols, bols[3], X, Y,seed);
   print bols;
   print R1;
   print R2;
   print niter RR1 RR2;
finish;
run program;
quit;
run;
Thank you,
Mantobaye
MP CONNECT is not multithreaded by definition; it is closer to multi-process.
I don't know Monte Carlo or IML.
But to make use of MP CONNECT, you need to be able to split your program into steps that can run in parallel.
So, if you are able to split your IML code into several separate PROC IML calls (I have no idea how to do that), then it's worth a try. There are plenty of examples in the documentation to help you get started.
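As a rough illustration of what such a split could look like, here is a minimal sketch with two asynchronous sessions, one per restriction. It assumes SAS/CONNECT is licensed, that the Grunfeld libref is already assigned in the parent session, and that the two PROC IML bodies (left as labeled placeholders) are filled in with the setup and modules from the question. The session names task1/task2, the libref pwork, and the data set names rr1/rr2 are made up for the example.

```sas
/* Child sessions are started with the same command that launched the parent. */
options sascmd="!sascmd";

/* INHERITLIB= lets each child see the parent's Grunfeld library and
   write back into the parent's WORK library under the name PWORK. */
signon task1 inheritlib=(Grunfeld work=pwork);
signon task2 inheritlib=(Grunfeld work=pwork);

/* WAIT=NO returns control immediately, so both blocks run at the same time. */
rsubmit task1 wait=no;
   proc iml;
      /* placeholder: data setup and modules from the question go here, */
      /* followed by the MC run for restriction R1                      */
      RR1 = .;   /* placeholder for the value returned by MC */
      create pwork.rr1 var {RR1};
      append;
      close pwork.rr1;
   quit;
endrsubmit;

rsubmit task2 wait=no;
   proc iml;
      /* placeholder: same setup, followed by the MC run for restriction R2 */
      RR2 = .;   /* placeholder for the value returned by MC */
      create pwork.rr2 var {RR2};
      append;
      close pwork.rr2;
   quit;
endrsubmit;

/* Block until both children have finished, then close them. */
waitfor _all_ task1 task2;
signoff _all_;

/* The results are now ordinary data sets in the parent's WORK library. */
data work.results;
   merge work.rr1 work.rr2;
run;

proc print data=work.results;
run;
```

Each child session starts its own random number stream, so giving both sessions the same seed would make them use the same draws. If the two runs are meant to be independent, it is probably safer to pass a different seed to each one.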
If you can't split up your PROC IML, you may need to look for ways of making it run more efficiently by itself.
OPTIONS FULLSTIMER; will tell you a little about the resources consumed. Do you run this on a SAS server? Perhaps you could get more resources by increasing MEMSIZE?
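For what it's worth, a small sketch of how one might check this. The PROC IML step is only a toy workload standing in for the real simulation; the matrix sizes are arbitrary.

```sas
/* Write detailed timing and memory statistics to the log after each step. */
options fullstimer;

/* MEMSIZE can only be set when SAS starts (command line or config file); */
/* these steps just report the current settings.                          */
proc options option=memsize value;
run;
proc options option=cpucount;
run;

/* Toy workload: compare "real time" and "cpu time" in the log afterwards. */
proc iml;
   x = j(2000, 200, .);
   call randseed(1);
   call randgen(x, "Normal");
   xpx = x` * x;
quit;
```

If the log shows real time close to CPU time and memory well below MEMSIZE, the step is most likely CPU-bound, and more memory alone would probably not help much.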
How many are too many?
It's up to you how many pieces you create.
If your box has 8 cores, you may want to create 4-8 processes, depending on how PROC IML itself utilizes threads.
Perhaps you need to develop some kind of parameterized macro logic to split the work into a reasonable number of subprocesses in MP CONNECT.
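A sketch of what that macro logic might look like, under the same assumptions as any MP CONNECT setup (SAS/CONNECT licensed, child sessions allowed on the machine). The macro name run_parallel, its parameters, and the data set names are hypothetical. The PROC IML body is a toy stand-in that rejects when a squared standard normal draw exceeds the chi-square critical value; the real statistic and MC modules would replace it. It also assumes niter is divisible by ntasks.

```sas
%macro run_parallel(ntasks=4, niter=1000);
   %local i;
   options sascmd="!sascmd";

   %do i = 1 %to &ntasks;
      /* One child session per chunk; PWORK points at the parent's WORK. */
      signon task&i inheritlib=(work=pwork);
      rsubmit task&i wait=no;
         /* Inside a macro, the macro references below are resolved in the */
         /* parent session before the code is sent to the child.           */
         proc iml;
            call randseed(1000 + &i);      /* different seed per task */
            nchunk = &niter / &ntasks;     /* iterations handled by this task */
            z = j(nchunk, 1, .);
            call randgen(z, "Normal");
            rej = sum(z##2 > 3.841455338); /* toy rejection count */
            create pwork.rej&i var {nchunk rej};
            append;
            close pwork.rej&i;
         quit;
      endrsubmit;
   %end;

   /* Wait for every child to finish, then close all sessions. */
   waitfor _all_ %do i = 1 %to &ntasks; task&i %end; ;
   signoff _all_;

   /* Stack the per-task counts and compute the overall rejection rate. */
   data work.rej_all;
      set rej1-rej&ntasks;
   run;

   proc sql;
      select sum(rej) / sum(nchunk) as reject_rate
      from work.rej_all;
   quit;
%mend run_parallel;

%run_parallel(ntasks=4, niter=1000)
```

Splitting the iterations rather than the restrictions means the number of processes can follow the number of cores instead of the number of tests. Any setup that is common to all chunks could be computed once in the parent and saved as a data set in WORK, which the children can then read through PWORK instead of recomputing it.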
LinusH,
In this serial format, I have used 3 different data sets, each subdivided into 5 sub-data sets of different sizes. These are used for Monte Carlo simulation and bootstrap experiments. I have tried to split the code into 3 pieces for one particular data set, but they still run serially. I would also like to avoid running the common parts more than once, if possible, to make the best use of resources.