parallel programming with MP CONNECT
11-11-2015 09:56 PM

I have written a program to conduct a Monte Carlo simulation, and it takes too long to run. I feel like I could use the multithreading feature of SAS (MP CONNECT) to make the code more efficient, but I am not sure how to do it. I am still reading about it and find it not straightforward. I have produced a similar but simplified version of the problem I am struggling with in the code below. I would appreciate your help with (i) advice on whether my feeling is right and (ii) how to do it efficiently (including references to relevant documents, etc.).

proc iml;

start program;

use Grunfeld.grunfeld;

read all var {inv} into Y;

read all var {v,k} into x1;

niter = 1000;

ones = j(nrow(x1),1,1);

X = ones||x1;

*run OLS;

bols = solve(X`*X,X`*Y);

* Restrictions;

R1 = j(1,3,0);

R1[2] = 1;

R2 = j(1,3,0);

R2[3] = 1;

seed = 1000;

* function for computing the test statistic g;

start statistic (g, decision, bhat, R, b, b0, x, y,seed);

decision = 0;

y = x*b + normal(j(nrow(x),1,seed));

bhat = solve(x`*x, x`*y);

vcovbhat = inv((x`*x));

start InvMat(InvM, M);

call SVD(Ustar, qstar, Vstar, M);

/* SVD gives M = Ustar*diag(qstar)*Vstar`, so the generalized inverse uses Ustar transposed */

InvM = Vstar*ginv(diag(qstar))*Ustar`;

finish InvMat;

v = R*vcovbhat*R`;

run InvMat(invv, v);

g = (R*bhat-b0)`*invv*(R*bhat-b0);

asympt_crit_value = 3.841455338;

if g > asympt_crit_value then decision = 1;

finish statistic;

* Monte carlo simulation p-value;

start MC(RR, g, niter, decision, bhat, R, bols, b0, X, Y,seed);

Rej = j(niter,1,0);

do i = 1 to niter;

run statistic(g, decision, bhat, R, bols, b0, X, Y,seed);

Rej[i] = decision;

end;

RR = Rej[:,];

Finish MC;

run MC(RR1, g1, niter, decision, bhat, R1, bols, bols[2], X, Y,seed);

run MC(RR2, g2, niter, decision, bhat, R2, bols, bols[3], X, Y,seed);

print bols;

print R1;

print R2;

print niter RR1 RR2;

finish;

run program;

quit;

run;

Thank you,

Mantobaye


11-12-2015 02:10 AM

MP Connect is not multi-threaded by definition; it's more like multi-process.

I don't know about Monte Carlo, nor IML.

But to utilize MP Connect, you need to be able to split up your program in steps that can run in parallel.

So, if you are able to split up your IML into several IML calls (I have no idea how to do that), then it's worth a try. There are plenty of examples in the documentation to help you get started.
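As a rough illustration of the splitting described above, an MP CONNECT program usually follows a sign on, submit asynchronously, wait, sign off pattern. The sketch below assumes SAS/CONNECT is licensed and that the extra sessions are started on the same machine. The task names `task1` and `task2` and the placeholder DATA steps are made up for illustration and are not taken from the original program.

```sas
/* Start additional SAS sessions on the local machine (assumes SAS/CONNECT). */
options sascmd="!sascmd";

signon task1;
signon task2;

/* WAIT=NO returns control immediately, so both blocks run at the same time. */
rsubmit task1 wait=no;
   data work.part1;              /* placeholder for the first independent step */
      do i = 1 to 1e6;
         x = rannor(1);
         output;
      end;
   run;
endrsubmit;

rsubmit task2 wait=no;
   data work.part2;              /* placeholder for the second independent step */
      do i = 1 to 1e6;
         x = rannor(2);
         output;
      end;
   run;
endrsubmit;

/* Block until both tasks have finished, then close the sessions. */
waitfor _all_ task1 task2;
signoff _all_;
```

Each spawned session has its own WORK library, so results that the parent needs must be written to a shared library or read back before signing off.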

If you can't split up your PROC IML, you may need to look for ways of running it more efficiently by itself.

OPTIONS FULLSTIMER; will tell you a little about the resources consumed. Do you run this on a SAS server? Perhaps you could get more resources by increasing MEMSIZE?
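A minimal way to check the resource settings mentioned above might look like the following. It only displays current values. Note that MEMSIZE is set when SAS starts (on the command line or in the configuration file), not with an OPTIONS statement during the session.

```sas
/* Add memory and CPU detail to the log note of every step that follows. */
options fullstimer;

/* Show the memory ceiling of the current session. */
proc options option=memsize;
run;

/* Show how many CPUs SAS believes are available. */
proc options option=cpucount;
run;
```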

Data never sleeps


11-12-2015 03:19 AM

Thank you LinusH for your comments.

The 2 run lines before the prints are completely independent. That said, the lines that actually conduct the Monte Carlo simulation could be split, and each could run independently with the rest of the program. The result would be the same as when they are run together, as in the code provided.

However, if I split the code into pieces, I will end up with too many pieces in my real case.
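One possible way to run the two independent `run MC(...)` calls side by side is sketched below. It is a condensed version of the simulation, not the original modules. The macro name `mc_session`, its parameters, the session names `mc1` and `mc2`, and the output data set `work.rr` are hypothetical. It assumes SAS/CONNECT is available and that the `Grunfeld` libref is already assigned in the parent session. Because the macro is expanded locally, `&idx` and `&seed` are resolved before the code is sent to the spawned session.

```sas
options sascmd="!sascmd";   /* assumes SAS/CONNECT; sessions start on this machine */

%macro mc_session(id, idx, seed);
   /* INHERITLIB lets the spawned session read the parent's Grunfeld libref. */
   signon &id inheritlib=(Grunfeld);
   rsubmit &id wait=no;
      proc iml;
         use Grunfeld.grunfeld;
         read all var {inv} into Y;
         read all var {v k} into X1;
         close Grunfeld.grunfeld;

         X    = j(nrow(X1), 1, 1) || X1;
         bols = solve(X`*X, X`*Y);

         /* Restriction that selects one coefficient, tested at its OLS value. */
         R = j(1, 3, 0);
         R[&idx] = 1;
         b0   = bols[&idx];
         invv = ginv(R * inv(X`*X) * R`);

         niter = 1000;
         seed  = &seed;
         Rej   = j(niter, 1, 0);
         do i = 1 to niter;
            ysim = X*bols + normal(j(nrow(X), 1, seed));
            bhat = solve(X`*X, X`*ysim);
            g    = (R*bhat - b0)` * invv * (R*bhat - b0);
            Rej[i] = (g > 3.841455338);
         end;

         restriction = &idx;
         RR = Rej[:];                /* rejection rate over all iterations */
         create work.rr var {restriction RR};
         append;
         close work.rr;
      quit;
   endrsubmit;
%mend mc_session;

%mc_session(mc1, 2, 1000)
%mc_session(mc2, 3, 2000)

waitfor _all_ mc1 mc2;

/* Read each session's WORK library from the parent before signing off. */
libname r1 slibref=work server=mc1;
libname r2 slibref=work server=mc2;

data work.rr_all;
   set r1.rr r2.rr;
run;

proc print data=work.rr_all;
run;

libname r1 clear;
libname r2 clear;
signoff _all_;
```

The two sessions are given different seeds here. In the serial program both runs share one random number stream within a single PROC IML, so the parallel results will not reproduce the serial ones digit for digit.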



11-12-2015 09:48 AM

How many are too many?

It's up to you how many pieces you create.

If your box has 8 cores, you may want to create 4-8 processes, depending on how PROC IML itself utilizes threads.

Perhaps you need to develop some kind of parameterized macro logic to split the work into a reasonable number of subprocesses in MP Connect.
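A minimal sketch of such macro logic, under some assumptions: the macro name `run_parallel`, its parameters, and the data set names are hypothetical, and the worker is a placeholder DATA step (it counts how often a squared standard normal draw exceeds the 5% chi-square critical value) rather than the original IML simulation. The idea is that `niter` iterations are divided into `nproc` chunks, each chunk runs in its own session with its own seed, and the parent combines the counts.

```sas
options sascmd="!sascmd";   /* assumes SAS/CONNECT; sessions start on this machine */

%macro run_parallel(nproc=4, niter=1000);
   %local i chunk;
   %let chunk = %sysevalf(&niter / &nproc, ceil);   /* iterations per session */

   /* Launch one asynchronous session per chunk. */
   %do i = 1 %to &nproc;
      signon task&i;
      rsubmit task&i wait=no;
         data work.rej;
            call streaminit(%eval(1000 + &i));      /* distinct seed per session */
            nrej = 0;
            do iter = 1 to &chunk;
               z = rand("Normal");
               nrej + (z*z > 3.841455338);          /* placeholder test statistic */
            end;
            n = &chunk;
            keep nrej n;
         run;
      endrsubmit;
   %end;

   /* Wait for every session, then read each remote WORK library. */
   waitfor _all_ %do i = 1 %to &nproc; task&i %end; ;

   %do i = 1 %to &nproc;
      libname r&i slibref=work server=task&i;
   %end;

   /* Combine the per-session counts into one rejection rate. */
   data work.rej_all;
      set %do i = 1 %to &nproc; r&i..rej %end; end=last;
      tot_rej + nrej;
      tot_n + n;
      if last then do;
         rate = tot_rej / tot_n;
         output;
      end;
      keep tot_rej tot_n rate;
   run;

   %do i = 1 %to &nproc;
      libname r&i clear;
   %end;
   signoff _all_;
%mend run_parallel;

%run_parallel(nproc=4, niter=1000)
```

Because the macro runs in the parent session, `&i` and `&chunk` are resolved locally before each block is sent, so the number of pieces is controlled by a single parameter instead of by copies of the code.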



11-12-2015 04:56 PM

LinusH,

In this serial format, I have used 3 different data sets, each subdivided into 5 sub-data sets of different sizes. These are used for Monte Carlo simulation and bootstrap experiments. I have tried to split the code into 3 pieces for one particular data set, but they still run serially. I would also like to avoid running the common parts many times, if possible, to optimize the use of resources.
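One way to avoid repeating the common part might be to compute it once in the parent session, save it to a data set, and let each spawned session read it through an inherited libref. The sketch below assumes SAS/CONNECT and an assigned `Grunfeld` libref. The names `work.common`, `pwork`, and `task1` are hypothetical, and the PRINT statement stands in for the actual simulation.

```sas
options sascmd="!sascmd";   /* assumes SAS/CONNECT; sessions start on this machine */

/* Common part: run once in the parent session and save the OLS estimates. */
proc iml;
   use Grunfeld.grunfeld;
   read all var {inv} into Y;
   read all var {v k} into X1;
   close Grunfeld.grunfeld;

   X    = j(nrow(X1), 1, 1) || X1;
   bols = solve(X`*X, X`*Y);

   create work.common var {bols};
   append;
   close work.common;
quit;

/* The parent's WORK library is visible in the spawned session as PWORK. */
signon task1 inheritlib=(work=pwork Grunfeld);

rsubmit task1 wait=no;
   proc iml;
      use pwork.common;
      read all var {bols};        /* reuse the estimates instead of recomputing */
      close pwork.common;
      print bols;                 /* placeholder for the simulation itself */
   quit;
endrsubmit;

waitfor _all_ task1;
signoff task1;
```

If the pieces still run one after another, it may be worth checking that each RSUBMIT specifies WAIT=NO, since RSUBMIT is synchronous by default.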