Hi,
I have a macro which creates a dataset which size depends on the input :
Columns names : a_1, a_2, a_3, a_4 ... .
I want to compute the mean of all the columns but the first two in the macro :
DATA want;
SET have;
mean_value=mean(a_3, a_4 ....);
RUN;
How can I dynamically select the columns ? (I know all the columns' name)
Maybe with columns' index/position (Mean on the 3rd column to the last one) ?
Gluttony
Well, if the variables of interest all begin with a_ then you can use something like this:
mean_value = mean(of a_3-a_19);
Well, if the variables of interest all begin with a_ then you can use something like this:
mean_value = mean(of a_3-a_19);
Is your issue that you do not know how many a_ variables there are and need help determining that?
Will you guarantee that there will always be 3 or more values?
Then perhaps:
data want; set have; array aa a_: ; mean_value = (sum (of aa(*)) - sum(a_1,a_2))/(n (of aa(*)) - 2); run;
but you would need to check how many values are in the AA array if you can't be sure there are always 3 or more.
@ballardw wrote:
Is your issue that you do not know how many a_ variables there are and need help determining that?
Will you guarantee that there will always be 3 or more values?
Then perhaps:
data want; set have; array aa a_: ; mean_value = (sum (of aa(*)) - sum(a_1,a_2))/(n (of aa(*)) - 2); run;but you would need to check how many values are in the AA array if you can't be sure there are always 3 or more.
Doesn't work if a_1 and/or a_2 is missing, the denominator will be wrong.
@PaigeMiller wrote:
@ballardw wrote:
Is your issue that you do not know how many a_ variables there are and need help determining that?
Will you guarantee that there will always be 3 or more values?
Then perhaps:
data want; set have; array aa a_: ; mean_value = (sum (of aa(*)) - sum(a_1,a_2))/(n (of aa(*)) - 2); run;but you would need to check how many values are in the AA array if you can't be sure there are always 3 or more.
Doesn't work if a_1 and/or a_2 is missing, the denominator will be wrong.
Yep.
Incomplete data description => incomplete response.
In my case, the columns were :
a_1, a_2, b_1, b_2, b_3 .... b_n
and I wanted to compute the mean on the "b" columns,
So @PaigeMiller's trick (before the edit) about using :
mean(of b:)
is perfect.
Thank you,
Gluttony
as long as there are no other variables whose name starts with B in your data set, that would work. For example, if your data set also contains variable BirthYear then the B: in the MEAN function won't work.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.