So let's say I have a macro that loops through 2 iterations, and at the end of each iteration my output is of the form:
Iteration 1 output:
data Example;
input Variable $ Var1 Var2 Var3;
datalines;
Mean 195 163 100
Max 220 198 240
;
run;
Iteration 2 output:
data Example2;
input Variable $ Var1 Var2 Var3;
datalines;
Mean 25 125 500
Max 210 141 630
;
run;
Inside, my macro looks like this:
PROC datasets lib = work nolist; DELETE Example;
%MACRO output_table(y);
/*
bla bla bla insert code
*/
PROC MEANS DATA = Example;
var Var1 Var2 Var3;
output out = Example&y;
QUIT;
PROC append base = Example data = Example&y;
%MEND output_table;
%MACRO iter_1_2;
%Do y = 1 %TO 2;
%output_table(&y);
%End;
%MEND iter_1_2;
RUN;
My end result is just the tables concatenated, for instance:
data Example_append;
input Variable $ Var1 Var2 Var3;
datalines;
Mean 195 163 100
Max 220 198 240
Mean 25 125 500
Max 210 141 630
;
run;
But what I really want is the summarized values of all the variables from all the tables. So in this case:
For the Mean of Var1-Var3, I'd have: (195+25)/2 = 110, (163+125)/2=144, (100+500)/2 = 300
and for max, I'd have: max(220,210)=220, max(198,141)=198, max(240,630)=630 and my output would look like:
data Example_append;
input Variable $ Var1 Var2 Var3;
datalines;
Mean 110 144 300
Max 220 198 630
;
run;
I'm unsure what I'm doing wrong and how to get there. Is it a PROC means step on my appended data, after a transpose? I can't transpose them because of the repeat Vars in the rows.
Any help is appreciated. Apologies for any typos.
Yes that's correct. My first loop comes out correctly. I understand the PROC means output also includes N, MIN, MAX, MEAN, STDEV, but I omitted it for simplicity.
The gist of what I am trying to do is once the loop is completed, the final table is summarized for each variable, within the same row. So if every iteration outputs a "Max" and "Mean" row, with the columns as Var1-Var3, then the #rows = 2 x #loops (1 loops = 2 rows, 2 loops = 4 rows etc), I instead want it so that after all my loops I only have 2 rows of data, regardless of loops. All the data are summarized in those two rows.
I'm not following, how are you planning to collapse those rows? You haven't shown any desired output.
Are you then trying to take the average of averages? That's generally not a good idea at all and you haven't specified the logic of how you'd want to collapse them.
If you wanted to do average of averages, I would recommend factoring in the size of each group at least since a group with N=5 will have a different distrubiton than N=100 and considering them equivalent would be unfair.
proc means data=appended MEAN;
class STAT;
var var1-var3;
output out=want mean=;
run;
@UniversitySas wrote:
The gist of what I am trying to do is once the loop is completed, the final table is summarized for each variable, within the same row. So if every iteration outputs a "Max" and "Mean" row, with the columns as Var1-Var3, then the #rows = 2 x #loops (1 loops = 2 rows, 2 loops = 4 rows etc), I instead want it so that after all my loops I only have 2 rows of data, regardless of loops. All the data are summarized in those two rows.
I was hoping to get an output like this:
data Wanted_output;
input Variable $ Var1 Var2 Var3;
datalines;
Mean 110 144 300
Max 220 198 630
;
run;
Where the mean corresponds to the means across all the different loops, similarly the max is now the max of all the different groups.
Is PROC Append not ideal for solving this?
@UniversitySas wrote:
I was hoping to get an output like this:
data Wanted_output; input Variable $ Var1 Var2 Var3; datalines; Mean 110 144 300 Max 220 198 630 ; run;
Where the mean corresponds to the means across all the different loops, similarly the max is now the max of all the different groups.
Is PROC Append not ideal for solving this?
Why not a loop like this pseudo-code:
%do i = 1 %to 2
PROC APPEND
%end
PROC MEANS
This PROC MEANS gets you the final output you said you wanted, mean of all groups, max of all groups
@Reeza makes a good point. At the start of this thread, @UniversitySas, you were very clear that you wanted the mean of the means; but now you seem to be talking about the mean of the data, which is not necessarily the same thing. So we need a lot more clarity here; and could you please discuss the calculations needed without confusing the issue by talking about some looping algorithm, as the algorithm in code should be thought of independently from the desired results.
I have an example of appending with reporting here, including the PROC MEANS portion.
https://gist.github.com/statgeek/353374a5d8ea4f0c89ce5d80a47f4a4c
Just an update - I managed to solve my problem.
The issue in my code was here:
PROC MEANS DATA = Example;
var Var1 Var2 Var3;
output out = Example&y;
QUIT;
What I did instead was:
PROC datasets ; DELETE temp;
%MACRO output_table(y);
/*
bla bla bla insert code
*/
PROC MEANS DATA = Example;
var Var1 Var2 Var3;
output out = Example;
QUIT;
PROC append base = temp data = Example;
%MEND output_table;
%MACRO iter_1_2;
%Do y = 1 %TO 2;
%output_table(&y);
%End;
%MEND iter_1_2;
%iter_1_2;
PROC MEANS DATA = temp;
var Var1 Var2 Var3;
RUN;
This seems to work!
Thanks again for all your help, sorry for the confusion in what I was trying to do.
Now let me say I don't think looping is really necessary, and that you shouldn't automatically decide that a loopong solution is necessary. It seems to me that you could have used BY statements to accomplish this, and there would be no need for looping or PROC APPEND, and the code is a lot easier to get right.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.