02-01-2016 07:39 AM
I am building a large simulation for power analysis, and I am stuck in what you may consider a relatively simple part of it.
I have a long SAS program that eventually creates a dataset. I want to run it 1000 times, with each iteration creating a different dataset. In each iteration, I want to run a mixed model and save one of the model's p-values, along with some additional parameters. My mixed model code is:
ods exclude all;
ods output tests3=mpval Diffs=mdif;
proc mixed data=data;
  class group(ref='control') ID visit;
  model CFB = group visit baseline group*visit;
  repeated visit / subject=id type=cs;
  lsmeans group*visit / diff=all cl;
  by parameter;
run;
ods output close;
ods exclude none;
This code runs the mixed model and saves the p-values in a dataset called mdif. In my next piece of code I filter out the p-values I do not need and keep only the one of interest (by doing where visit = x). Note that I have two dependent variables of interest, and so this runs twice, thus I get two p-values.
The next time I run this code, in the next iteration, the dataset mdif will be overwritten. I am not sure how to save all the p-values along with the parameter being used and the sample size (my main do loop should cover several values of N). Can you please help me build the simulation loop?
Thank you in advance!
02-01-2016 07:55 AM
How are you looping?
Look into proc append to store results.
proc append base=pvalues data=mpval; run;
Make sure to delete the base dataset at the beginning of your process; otherwise results from earlier runs of the macro will still be in it and each new run will append to them.
Proc Append does not require that the base dataset originally exist.
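A minimal sketch of that clear-then-append pattern, assuming the accumulated table is called pvalues in the WORK library (as in the line above) and that mpval is the per-iteration ODS output from the original post:

```sas
/* Run once, before the first iteration: drop results left over from an
   earlier run. NOWARN keeps the log quiet if PVALUES does not exist yet. */
proc datasets library=work nolist nowarn;
    delete pvalues;
quit;

/* Run inside every iteration, after PROC MIXED: add this iteration's
   results to the accumulated table. The first call creates PVALUES. */
proc append base=pvalues data=mpval;
run;
```

If the character variables in mpval can differ in length from one iteration to the next, the FORCE option on PROC APPEND may be needed; whether that applies here depends on the data.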
02-01-2016 08:09 AM - edited 02-01-2016 08:23 AM
The more I think about it, the more I find that the looping itself is a bigger problem than I anticipated. I do not have the loop yet. What I have is code that generates an initial dataset (this should not be in the loop); let's call this dataset RAW. Then I have macro code I wrote that does something to RAW and yields a new dataset called DATA. The process of creating DATA should be in the loop.
I don't know how to write the loop: when I did power analysis before, the loop was inside the data step that created the dataset. Now I want the loop to be outside the creation of the data, not inside. To simplify, I start with RAW, and in each iteration I want to run %ActionA, %ActionB, %ActionC, which will eventually create a dataset called DATA. I need a loop that wraps the 3 actions, runs a mixed model on the resulting DATA, and stores the p-values.
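One way such an outer loop could look is a macro with nested %DO loops. This is only a sketch under stated assumptions: %ActionA, %ActionB and %ActionC are the poster's own macros and are not defined here, the names simulate, nsim, nlist, sample_size, iter, mdif_tagged and all_diffs are made up for illustration, and how the sample size reaches the data-creation macros is not shown in the thread.

```sas
%macro simulate(nsim=1000, nlist=50 100);   /* hypothetical name and defaults */
    %local i k n;

    /* Start from an empty results table */
    proc datasets library=work nolist nowarn;
        delete all_diffs;
    quit;

    %do k = 1 %to %sysfunc(countw(&nlist, %str( )));
        %let n = %scan(&nlist, &k, %str( ));
        %do i = 1 %to &nsim;

            /* Poster's black-box macros that build DATA from RAW.
               Assumption: they pick up the sample size &n in whatever
               way they are written to accept it. */
            %ActionA;
            %ActionB;
            %ActionC;

            ods exclude all;
            ods output Diffs=mdif;
            proc mixed data=data;
                class group(ref='control') ID visit;
                model CFB = group visit baseline group*visit;
                repeated visit / subject=id type=cs;
                lsmeans group*visit / diff=all cl;
                by parameter;
            run;
            ods exclude none;

            /* Tag the results so iterations can be told apart later */
            data mdif_tagged;
                set mdif;
                sample_size = &n;
                iter = &i;
            run;

            proc append base=all_diffs data=mdif_tagged;
            run;
        %end;
    %end;
%mend simulate;

%simulate(nsim=1000, nlist=50 100);
```

The filter that keeps only the comparison of interest (the where visit = x step from the first post) could go in the tagging data step, so that only the needed rows are appended.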
02-01-2016 08:32 AM
Looping over PROC and DATA steps 1000 times will be rather slow.
You should create a data set with 1000 replications of whatever sampled or simulated data you need. Then you can run PROC MIXED with BY replicate; and the rest of the calculations will similarly be done BY replicate.
Show example data and process for one replicate.
02-01-2016 08:43 AM
I am not sure how to write tables here, so I'll list my variables instead. I start with a dataset RAW. Then I have 3 macros, one of them fairly complicated, that use RAW and eventually create a dataset called DATA (or any other name). The macros are like black boxes.
The outcome of the macros is a dataset with the following variables:
The dataset DATA will have around 1026 observations (513 for each parameter). In each iteration, DATA will be different, as there is a random component in its creation. In each iteration, I want to run the mixed model specified above. I will do this 1000 times. For each parameter and for each sample size (which should be defined in the loop) I want the proportion of p-values under 0.05.
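Once the p-values from all iterations sit in one table, that proportion could be computed roughly as follows. This is a sketch with assumed names: all_diffs is a hypothetical accumulated table holding one row per iteration, parameter and sample size, sample_size is a hypothetical variable added during the loop, and Probt is the p-value column of the PROC MIXED Diffs table.

```sas
/* Flag each p-value as significant (1) or not (0) at the 5% level.
   SAS treats a missing value as smaller than any number, so missing
   p-values are skipped explicitly instead of being counted as < 0.05. */
data flagged;
    set all_diffs;
    if not missing(Probt) then significant = (Probt < 0.05);
run;

/* The mean of a 0/1 flag is the proportion of significant results,
   i.e. the estimated power for each parameter and sample size. */
proc means data=flagged noprint nway;
    class parameter sample_size;
    var significant;
    output out=power mean=power n=n_sims;
run;

proc print data=power;
    var parameter sample_size n_sims power;
run;
```

The n_sims column shows how many non-missing p-values went into each estimate, which helps spot iterations where the model did not produce a result.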
02-01-2016 11:53 AM
Something like this, where you generate all the samples/simulations first and then analyze them BY replicate:
proc print data=sashelp.shoes(obs=20);
run;

%macro main(rep=5);
  /*Create samples number of sample is &REP*/
  proc surveyselect data = sashelp.shoes method=urs sampsize=50 rep=&rep seed=12345 out=Sample_WR;
    id _all_;
    samplingunit region Subsidiary;
  run;

  data sample_wr;
    set sample_wr;
    do sampleUnitID=1 to numberhits;
      output;
    end;
  run;

  proc sort data=sample_wr;
    by Replicate region Subsidiary sampleUnitID;
  run;

  data sample_wr;
    set sample_wr;
    by Replicate region Subsidiary sampleunitid;
    if first.Replicate then sampleID = 0;
    if first.sampleunitid then sampleID + 1;
  run;

  /*Analysis BY replicate that is your loop*/
  ods select none;
  ods output tests3=tests3 lsmeans=lsmeans;
  proc mixed data=sample_wr;
    by replicate;
    class region;
    model sales = inventory region;
    lsmeans region;
  run;
  ods select all;
%mend main;

%main(rep=3);
02-02-2016 08:47 AM
See this article for guidance on efficient simulation:
The article includes examples and references.