Fluorite | Level 6

## Dynamically select variables on which compute the mean

Hi,

I have a macro which creates a dataset which size depends on the input :

Columns names : a_1, a_2, a_3, a_4 ... .

I want to compute the mean of all the columns but the first two in the macro :

``````DATA want;
SET have;
mean_value=mean(a_3, a_4 ....);
RUN;``````

How can I dynamically select the columns ? (I know all the columns' name)

Maybe with columns' index/position (Mean on the 3rd column to the last one) ?

Gluttony

1 ACCEPTED SOLUTION

Accepted Solutions
Diamond | Level 26

## Re: Dynamically select variables on which compute the mean

Well, if the variables of interest all begin with a_ then you can use something like this:

``mean_value = mean(of a_3-a_19);``

--
Paige Miller
7 REPLIES 7
Diamond | Level 26

## Re: Dynamically select variables on which compute the mean

Well, if the variables of interest all begin with a_ then you can use something like this:

``mean_value = mean(of a_3-a_19);``

--
Paige Miller
Fluorite | Level 6

Thank you !!!!
That's perfect !
Gluttony
Super User

## Re: Dynamically select variables on which compute the mean

Is your issue that you do not know how many a_ variables there are and need help determining that?

Will you guarantee that there will always be 3 or more values?

Then perhaps:

```data want;
set have;
array aa a_: ;
mean_value = (sum (of aa(*)) - sum(a_1,a_2))/(n (of aa(*)) - 2);
run;```

but you would need to check how many values are in the AA array if you can't be sure there are always 3 or more.

Diamond | Level 26

## Re: Dynamically select variables on which compute the mean

@ballardw wrote:

Is your issue that you do not know how many a_ variables there are and need help determining that?

Will you guarantee that there will always be 3 or more values?

Then perhaps:

```data want;
set have;
array aa a_: ;
mean_value = (sum (of aa(*)) - sum(a_1,a_2))/(n (of aa(*)) - 2);
run;```

but you would need to check how many values are in the AA array if you can't be sure there are always 3 or more.

Doesn't work if a_1 and/or a_2 is missing, the denominator will be wrong.

--
Paige Miller
Super User

## Re: Dynamically select variables on which compute the mean

@PaigeMiller wrote:

@ballardw wrote:

Is your issue that you do not know how many a_ variables there are and need help determining that?

Will you guarantee that there will always be 3 or more values?

Then perhaps:

```data want;
set have;
array aa a_: ;
mean_value = (sum (of aa(*)) - sum(a_1,a_2))/(n (of aa(*)) - 2);
run;```

but you would need to check how many values are in the AA array if you can't be sure there are always 3 or more.

Doesn't work if a_1 and/or a_2 is missing, the denominator will be wrong.

Yep.

Incomplete data description => incomplete response.

Fluorite | Level 6

## Re: Dynamically select variables on which compute the mean

In my case, the columns were :

a_1, a_2, b_1, b_2, b_3 .... b_n

and I wanted to compute the mean on the "b" columns,

So @PaigeMiller's trick (before the edit) about using :

``mean(of b:)``

is perfect.
Thank you,

Gluttony

Diamond | Level 26

## Re: Dynamically select variables on which compute the mean

as long as there are no other variables whose name starts with B in your data set, that would work. For example, if your data set also contains variable BirthYear then the B: in the MEAN function won't work.

--
Paige Miller
Discussion stats
• 7 replies
• 611 views
• 2 likes
• 3 in conversation