SAS in Unix environment, parallel computing problem

Occasional Contributor
Posts: 6


Hi everyone,

I hope somebody can help me. I am working with SAS in a Unix environment through a remote server, and to make my simulations faster I have to run my program in parallel. To do so, I have five parameters that vary across 243 possible combinations, which are stored in a CSV file (the one the server reads in order to substitute those values into my code).

I am fitting two meta-analysis models with PROC MIXED. My problem is that when I try to put all the output together with the following code, I only recover the simulations corresponding to the last combination of factors.

libname r " ";

ods listing close;
ods noresults;
options nonotes;

/* Reading the parameters from the csv file */
%let varw = %scan(&sysparm,1,':');
%let varv = %scan(&sysparm,2,':');
%let varr = %scan(&sysparm,3,':');
%let varu = %scan(&sysparm,4,':');
%let smd  = %scan(&sysparm,5,':');
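To verify the splitting, the parsed values can be echoed to the log at this point (a small debugging sketch of mine; the -sysparm invocation shown in the comment is an assumption about how the cluster launches each job, not something stated in the thread):

```sas
/* Assumed invocation per job (hypothetical command line):
     sas program.sas -sysparm "0.1:0.2:0.3:0.4:0.5"
   Echo the parsed pieces so each job's log shows its own parameters */
%put NOTE: varw=&varw varv=&varv varr=&varr varu=&varu smd=&smd;
```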

/* macro for generating the data and performing the four-level meta-analysis */
%macro fourlevel;

%let k  = 3;        /* number of studies */
%let c  = 30;       /* number of countries */
%let oc = 2;        /* number of outcomes */
%let dataset = 10;  /* number of simulated datasets for each combination */

%do d = 1 %to &dataset;

/* simulating sample-level residuals (level 1) */
data sample;
  do country = 1 to &c;
    do study = 1 to &k;
      do outcome = 1 to &oc;
        r = normal(0);
        output;
      end;
    end;
  end;
run;

/* simulating outcome-level residuals (level 2) */
data outcome;
  do country = 1 to &c;
    do study = 1 to &k;
      do outcome = 1 to &oc;
        u = normal(0);
        output;
      end;
    end;
  end;
run;

/* simulating random study effects (level 3) */
data study;
  do country = 1 to &c;
    do study = 1 to &k;
      v = normal(0);
      output;
    end;
  end;
run;

/* simulating random country effects (level 4) */
data country;
  do country = 1 to &c;
    w = normal(0);
    output;
  end;
run;

data all;
  merge country study;
  by country;
run;

data all;
  merge all outcome;
  by country study;
run;

data all;
  merge all sample;
  by country study outcome;
run;

data all;
  set all;
  w = w*sqrt(&varw);
  v = v*sqrt(&varv);
  u = u*sqrt(&varu);
  r = r*sqrt(&varr);
  d = &smd + w + v + u + r;
  precision = 1/
  /*keep country study outcome d precision;*/
run;

/* performing the meta-analysis */
/* four level */
proc mixed data=all;
  class country study outcome;
  weight precision;
  model d = /solution;
  random intercept / sub=outcome(study*country);
  random intercept / sub=study(country);
  random intercept / sub=country;
  parms 1 1 1 1 / hold=4;
  ods output solutionF=fixed1 covparms=random1;
run;

%do out = 1 %to 1;

  data fixed&out;
    set fixed&out;
    model = &out;
    k = &k;
    d = &d;
    oc = &oc;
    varw = &varw;
    varv = &varv;
    varr = &varr;
    varu = &varu;
    smd = &smd;
  run;

  proc append base=fixedall&out data=fixed&out force nowarn;
  run;

  data random&out;
    set random&out;
    model = &out;
    k = &k;
    d = &d;
    oc = &oc;
    varw = &varw;
    varv = &varv;
    varr = &varr;
    varu = &varu;
    smd = &smd;
  run;

  proc append base=randomall&out data=random&out;
  run;

%end;

%end; /* end of &dataset loop */

%mend fourlevel;

%fourlevel;
quit;

data r.fixedall1;
  set fixedall1;
run;

data r.randomall1;
  set randomall1;
run;

Thanks in advance for any help

SAS Employee
Posts: 340

Re: SAS in Unix environment, parallel computing problem

The code you posted does not run in parallel by itself.

Is there a wrapper script that runs this 243 times?


Perhaps include one more parameter in the &sysparm vector.

Then you will have one additional line:

%let combinationNum = %scan(&sysparm,6,':');


And change the last 2 data steps to:

data r.fixedall&combinationNum.;

set fixedall1;

run;

data r.randomall&combinationNum.;

set randomall1;

run;

At the very end you have to assemble the 243 results with 2 additional data steps:

data r.fixedall_final;

     set r.fixedall:;

run;

data r.randomall_final;

     set r.randomall:;

run;
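One caveat about the colon shorthand (my own assumption about rerun behavior, not something stated in the thread): `set r.fixedall:;` reads every dataset whose name starts with that prefix, so on a second run it would also pick up r.fixedall_final itself. Deleting the assembled tables before re-assembling avoids that:

```sas
/* Remove any previously assembled results first, so the
   colon prefix list does not match them on a rerun */
proc datasets lib=r nolist;
  delete fixedall_final randomall_final;
quit;
```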

Occasional Contributor
Posts: 6

Re: SAS in Unix environment, parallel computing problem

Thank you Gergely for your answer,

I have tried your recommendations: I added a new parameter just to give a number to each combination of factors. However, the final output does not assemble all 243 results, only some of them. Many combinations are repeated and many others are missing (for instance, I got combinations 5, 10, 16, and so on, while the combinations at the beginning and in between are missing, and for the ones I did get there are several repeated results).

Do you know what is happening?

(Answering your question about it not being parallel: I don't have much experience with that kind of programming. In this case I am working through a supercomputer server which, according to its manuals, splits the job across the cluster in parallel when one submits it with the CSV file containing the parameters.)

Waiting for your help

Mauricio

SAS Employee
Posts: 340

Re: SAS in Unix environment, parallel computing problem

So your CSV file looks now something like this:

1:2:3:4:5:1

0.1:0.22:1.3:2.4:1.5:2

...

0.1:0.22:1.3:2.4:1.5:243

And when you run this program through the supercomputer, you get the following datasets:

r.fixedall1, r.fixedall2, ..., r.fixedall243,

r.randomall1, r.randomall2, ..., r.randomall243.

And the problem is that, for example, the content of r.fixedall1, r.fixedall2, r.fixedall3, r.fixedall4, and r.fixedall5 is the same?


First of all, double-check your CSV file.


The last 2 data steps I wrote should not be included in your program! You should run them separately, after all 243 scenarios have finished!


Is the libname statement exactly this in your program? (Or have you changed it in the post to hide the path?)

libname r " ";


I think the system runs your 243 tasks in parallel, but always 5 or 6 instances at a time, and somehow the results overwrite each other.

Does the supercomputer system manual say anything about how to write the code? Maybe restrictions on the usage of libraries? On the usage of the WORK library?

If it is still not working, you could try adding the &combinationNum. suffix to all the dataset names that you use.
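As a sketch of that last suggestion (hypothetical names, assuming each job receives &combinationNum as described above), every intermediate dataset would carry the combination number so that concurrently running jobs can never collide on a name:

```sas
/* Each job works only on datasets tagged with its own number */
data all_&combinationNum.;
  merge country_&combinationNum. study_&combinationNum.;
  by country;
run;

proc append base=fixedall_&combinationNum.
             data=fixed_&combinationNum. force nowarn;
run;
```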


Occasional Contributor
Posts: 6

Re: SAS in Unix environment, parallel computing problem

Dear Gergely,

It has been a long time since my last update. I have been working more on my code, and only now did I realize that it actually is running in parallel on the HPC cluster. The problem lies in the fact that, while it runs, the final data set is sometimes in use by another process, and for that reason I lose several rows from it.

The error messages that I got in my log file look like the following:

ERROR: A lock is not available for WORK.FIXEDALL1.DATA.

ERROR: Lock held by process 37767.

I tried to tackle this problem using a macro as described here: http://www.lexjansen.com/pharmasug/2005/posters/po33.pdf

However, it still gives me incomplete outputs. Now I wonder whether keeping the data set in RAM instead of on the worker node's disk could help me out. I have a vague notion that PROC IML could help, but so far I don't have a clue how to do it.
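A minimal retry sketch along those lines (my own, not the macro from the linked paper; the open() probe is only a heuristic and still leaves a small race window between the check and the append):

```sas
%macro safe_append(base=, data=, maxtries=10);
  %local try dsid rc;
  %do try = 1 %to &maxtries;
    /* probe the base table: open() returns 0 if another
       process holds an exclusive lock on it */
    %let dsid = %sysfunc(open(&base));
    %if &dsid > 0 or not %sysfunc(exist(&base)) %then %do;
      %if &dsid > 0 %then %let rc = %sysfunc(close(&dsid));
      proc append base=&base data=&data force;
      run;
      %return;
    %end;
    /* base is locked by another process: wait 2 seconds, retry */
    %let rc = %sysfunc(sleep(2, 1));
  %end;
  %put ERROR: could not append &data to &base after &maxtries tries.;
%mend safe_append;
```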

Super User
Posts: 6,390

Re: SAS in Unix environment, parallel computing problem

Are you trying to have the separate parallel processes write back to the same FIXEDALL dataset?

You should probably have each thread write its own summary file and then combine them after all of the threads have finished.

If you want them to write to a common table, it will need to be a table that supports concurrent access, such as a database system or a SAS/SHARE server.
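This pattern can be sketched as follows (hypothetical names, assuming the suffix arrives via &sysparm as discussed earlier): each of the 243 jobs writes only to its own permanent dataset, and one separate job, run after all of them have finished, concatenates everything:

```sas
/* Inside each parallel job: write only this job's own file */
data r.fixed_&combinationNum.;
  set fixedall1;
run;

/* In a single follow-up job, after all 243 have finished: */
data r.fixed_final;
  set r.fixed_:;  /* all datasets whose names start with fixed_ */
run;
```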

Occasional Contributor
Posts: 6

Re: SAS in Unix environment, parallel computing problem

Indeed, I did what you suggested and had each thread write its own summary, and it finally worked. Thank you very much for your help!
