Hello SAS community,
I am attaching the code I use to run a simulation. The code has several steps, and the final data table mean2 holds the average of a quantity per decile of another variable.
I would like to run each simulation N times. That is, for each set of values for the "starting" variables nobs, nvars, etc. (which I change manually), I would like to run the code N times, stack all the resulting files mean2_1, mean2_2,...,mean2_N and take averages of the variable of interest for each decile.
For example, if the stacked file that includes all the simulations is called mean_N, with N rows of data, the data file I want should come from:
proc means data=mean_N noprint;
var _0 _1 _2 _3 _4 _5 _6 _7 _8 _9;
output out=WANT mean=A_0 A_1 A_2 A_3 A_4 A_5 A_6 A_7 A_8 A_9; /* one output name per VAR variable, in order */
run;
Thank you all very much!
Accepted Solutions
It sounds like you may want to modify this data step:
data sim0;
  array x_ {&nvars};
  do j=1 to &nobs;
    do i=1 to dim(x_);
      x_(i)= &mean + &vol*rannor(3452083); /*https://online.stat.psu.edu/stat482/book/export/html/663*/
      s=rand('Uniform');
    end;
    output;
  end;
  drop i j;
run;
To add a number of runs:
data sim0;
  array x_ {&nvars};
  do run=1 to 10; /* or yet another macro variable */
    do j=1 to &nobs;
      do i=1 to dim(x_);
        x_(i)= &mean + &vol*rannor(3452083); /*https://online.stat.psu.edu/stat482/book/export/html/663*/
        s=rand('Uniform');
      end;
      output;
    end;
  end; /* of the run loop */
  drop i j;
run;
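As a minimal sketch of the "yet another macro variable" idea, the number of runs could be set next to the other starting values. The macro variable name nruns and all of the values assigned below are assumptions for illustration; the thread does not state them.

```sas
/* Hypothetical starting values; the thread does not give the real ones */
%let nobs  = 100;  /* observations per run */
%let nvars = 5;    /* number of x_ variables */
%let mean  = 0;
%let vol   = 1;
%let nruns = 10;   /* assumed name for the number of simulation runs */

data sim0;
  array x_ {&nvars};
  do run = 1 to &nruns;            /* outer loop: one pass per simulation run */
    do j = 1 to &nobs;
      do i = 1 to dim(x_);
        x_(i) = &mean + &vol*rannor(3452083);
        s = rand('Uniform');
      end;
      output;                      /* one row per observation within a run */
    end;
  end;
  drop i j;
run;
```

With these values sim0 should contain &nruns times &nobs rows, already ordered by run, so later steps can use BY RUN without an extra sort.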
Then include BY RUN in all of the following steps. In any step that already uses BY processing, make RUN the first BY variable.
The final PROC MEANS would then have BY RUN as well and produce a summary for each run.
After that, if you want an analysis across the runs, add another PROC MEANS, PROC SUMMARY, or whatever analysis you need, without BY RUN.
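The two-stage summary described above might look like the following sketch. It uses a small made-up data set in long form; the names toy, decile, value, per_run, and want are hypothetical and do not come from the original code, which keeps the deciles in columns _0 to _9.

```sas
/* Toy data: 5 runs, 10 deciles, 20 values per decile (all hypothetical) */
data toy;
  call streaminit(12345);
  do run = 1 to 5;
    do decile = 0 to 9;
      do k = 1 to 20;
        value = decile + rand('Normal');
        output;
      end;
    end;
  end;
  drop k;
run;

/* Stage 1: mean per decile within each run (uses BY RUN) */
proc means data=toy noprint nway;
  by run;
  class decile;
  var value;
  output out=per_run mean=mean_value;
run;

/* Stage 2: average of the per-run means across runs (no BY RUN) */
proc means data=per_run noprint nway;
  class decile;
  var mean_value;
  output out=want mean=avg_value;
run;
```

Here per_run should hold one row per run and decile, and want one row per decile. The toy data are created in run order, so no PROC SORT is needed before the BY statement; real data may need one.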
Thank you!
This works well.
The only part I am not sure of is whether the RETAIN statement works the same way after I include two BY variables.
Before I had:
data sim6;
set sim5;
by id;
retain y;
if first.id then do;
y=&s;
end;
else do;
shares=y;
y= y*(1-prop_sold);
end;
drop shares;
rename y=shares;
run;
Now I have:
data sim6;
set sim5;
by run id;
retain y;
if first.id then do;
y=&s;
end;
else do;
shares=y;
y= y*(1-prop_sold);
end;
drop shares;
rename y=shares;
run;
Does this piece of the code work the same way with two by's?
Thank you again!
Retain works exactly the same.
The question is how the FIRST. and LAST. variables behave when they are used to reset the retained values. Each BY variable has its own FIRST. and LAST. variables, so within a RUN group you still have FIRST.ID and LAST.ID as needed. FIRST.ID is also set to 1 whenever RUN changes, even if the ID value itself repeats.
You may need to take care if you re-sort the data, though.
A very useful exercise is to create a data set, process it with 2 or 3 BY variables, and create variables that capture the FIRST. and LAST. values so you can examine them in a data set. Dummy code for the second part:
data examine;
  set have;
  by a b c;
  Firsta = First.A;
  Firstb = First.B;
  Firstc = First.C;
  Lasta = Last.A;
  Lastb = Last.B;
  Lastc = Last.C;
run;
I wouldn't make the HAVE data set big; 3 or 4 values of each variable should be sufficient. Make sure to include different combinations. BY processing requires sorted data, of course.
You can learn more by using unsorted values and adding the NOTSORTED option to the BY statement.
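A self-contained version of that exercise, applied to the BY RUN ID case from the question, could look like the sketch below. The data values and the variable t are made up for illustration.

```sas
/* Small hypothetical data set, already sorted by run and id */
data have;
  input run id $ t;
  datalines;
1 A 1
1 A 2
1 B 1
2 B 1
2 B 2
;
run;

/* Copy the automatic FIRST./LAST. flags into real variables */
data examine;
  set have;
  by run id;
  first_run = first.run;
  first_id  = first.id;
  last_id   = last.id;
run;

proc print data=examine noobs;
run;
```

The fourth row (run=2, id=B) is the one to look at: id has the same value as in the row before it, but run has changed, so first_id should be 1 there. That is why resetting a retained value on FIRST.ID still restarts it at the beginning of every run.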
Thanks, I'll try that. In this case I think I am OK, as the dataset used with retain and first is sorted already, so the first observation in each run id group is the one I want.