Hello SAS community,
I am attaching the code I use to run a simulation. The code has several steps, and the final data table, mean2, holds the average of a quantity per decile of another variable.
I would like to run each simulation N times. That is, for each set of values of the "starting" variables nobs, nvars, etc. (which I change manually), I would like to run the code N times, stack all the resulting files mean2_1, mean2_2, ..., mean2_N, and take the average of the variable of interest for each decile.
For example, if the file that includes all the simulations is called mean2_N with N rows of data, the data file I want should be
proc means data=mean2_N noprint;
var _0 _1 _2 _3 _4 _5 _6 _7 _8 _9;
output out=WANT mean=A_0 A_1 A_2 A_3 A_4 A_5 A_6 A_7 A_8 A_9;
run;
Thank you all very much!
It sounds like you may want to modify this data step:
data sim0;
  array x_ {&nvars};
  do j=1 to &nobs;
    do i=1 to dim(x_);
      x_(i)= &mean + &vol*rannor(3452083); /*https://online.stat.psu.edu/stat482/book/export/html/663*/
      s=rand('Uniform');
    end;
    output;
  end;
  drop i j;
run;
To add a number of runs:
data sim0;
  array x_ {&nvars};
  do run= 1 to 10; /* or yet another macro variable*/
    do j=1 to &nobs;
      do i=1 to dim(x_);
        x_(i)= &mean + &vol*rannor(3452083); /*https://online.stat.psu.edu/stat482/book/export/html/663*/
        s=rand('Uniform');
      end;
      output;
    end;
  end; /* of the Run loop*/
  drop i j;
run;
Then include BY RUN in all of the following steps. For any of your code that already uses BY processing, make RUN the first BY variable.
The final Proc Means would then have BY RUN as well and produce a summary for each of the runs.
After that, if you want an analysis ACROSS the runs, add another Proc Means/Summary (or whatever analysis you need) that does not use BY RUN.
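As a rough sketch of that pattern end to end: the data set and variable names below (sim, x, y, mean_by_run, want) and the macro values are made up for illustration, since the full program is not shown in this thread. It also uses call streaminit with rand('Normal') instead of rannor, and it keeps the result in long form (one row per decile) rather than the wide _0 to _9 layout.

```sas
/* Hypothetical settings; substitute your own starting values */
%let nruns = 10;
%let nobs  = 1000;

/* One data set holding every run, identified by RUN */
data sim;
  call streaminit(3452083);
  do run = 1 to &nruns;
    do id = 1 to &nobs;
      x = rand('Normal');         /* variable used to form deciles */
      y = 2*x + rand('Normal');   /* quantity of interest */
      output;
    end;
  end;
run;

/* Deciles 0-9 of x, computed separately within each run */
proc rank data=sim out=ranked groups=10;
  by run;
  var x;
  ranks decile;
run;

/* Mean of y per decile within each run */
proc means data=ranked noprint nway;
  class run decile;
  var y;
  output out=mean_by_run(drop=_type_ _freq_) mean=y_mean;
run;

/* Average of the per-run decile means across runs: RUN is left out here */
proc means data=mean_by_run noprint nway;
  class decile;
  var y_mean;
  output out=want mean=avg_y_mean;
run;
```

The data step writes the runs in order, so BY RUN works in Proc Rank without a sort. If an intermediate step reorders the data, a Proc Sort by run would be needed before the next BY step.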
Thank you!
This works well.
The only part I am not sure of is whether the retain command works the same way after I include 2 by's.
Before I had:
data sim6;
set sim5;
by id;
retain y;
if first.id then do;
y=&s;
end;
else do;
shares=y;
y= y*(1-prop_sold);
end;
drop shares;
rename y=shares;
run;
Now I have:
data sim6;
set sim5;
by run id;
retain y;
if first.id then do;
y=&s;
end;
else do;
shares=y;
y= y*(1-prop_sold);
end;
drop shares;
rename y=shares;
run;
Does this piece of the code work the same way with two by's?
Thank you again!
Retain works exactly the same.
The question is how the First. and Last. variables behave when you use them to reset the retained values. Each BY variable has its own First. and Last. flags, and First.Id is also set whenever Run changes. So within each Run group you still get First.Id and Last.Id for every Id, as needed.
Sort order matters, though: the data must be ordered by Run and then Id for BY RUN ID to work.
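A quick way to check this is a small test like the sketch below. The sim5 rows and the value of &s are made up for illustration, and the step is a condensed version of the sim6 step from the question (it assumes sim5 has no existing shares variable).

```sas
%let s = 100;  /* hypothetical starting number of shares */

/* Same id in both runs, so BY ID alone would not see a new group at run 2 */
data sim5;
  input run id prop_sold;
  datalines;
1 1 0.10
1 1 0.25
2 1 0.10
2 1 0.50
;
run;

data sim6;
  set sim5;
  by run id;
  retain y;
  if first.id then y = &s;        /* reset at the start of each run/id group */
  else y = y*(1 - prop_sold);     /* carry the retained value forward */
  rename y = shares;
run;

/* shares is 100, 75, 100, 50: the reset happens again at run 2 */
proc print data=sim6;
run;
```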
A very useful exercise is to create a small data set, process it with 2 or 3 BY variables, and create variables that capture the First. and Last. values so you can examine them in a data set. Dummy code for the second part:
data examine;
  set have;
  by a b c;
  Firsta = First.A;
  Firstb = First.B;
  Firstc = First.C;
  Lasta = Last.A;
  Lastb = Last.B;
  Lastc = Last.C;
run;
I wouldn't make the Have set big; 3 or 4 values of each variable should be sufficient. Make sure to include different combinations. BY processing requires sorted data, of course.
For additional learning, try unsorted values and add the NOTSORTED option to the BY statement.
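For instance, a tiny made-up Have data set with two BY variables could look like the sketch below. The values are hypothetical and chosen so that b keeps the same value across a change in a.

```sas
/* Already sorted by a b; b stays 'y' while a changes from 1 to 2 */
data have;
  input a b $;
  datalines;
1 x
1 x
1 y
2 y
2 y
2 z
;
run;

data examine;
  set have;
  by a b;
  Firsta = First.A;
  Lasta  = Last.A;
  Firstb = First.B;
  Lastb  = Last.B;
run;

/* Row 3 has Lastb=1 and row 4 has Firstb=1 even though b is 'y' on both,
   because a change in an earlier BY variable starts a new group for b too */
proc print data=examine;
run;
```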
Thanks, I'll try that. In this case I think I am OK, as the dataset used with retain and first is sorted already, so the first observation in each run id group is the one I want.