Calcite | Level 5

## AVERAGE OF A VARIABLE NUMBER OF VALUES

Hi everyone.
My SAS version is 9.4 and I have a dataset with 128967 observations. For each observation I have a date variable (dd/mm/yyyy) and 12 months of amounts (IMPO_1 -- IMPO_12). Here is a subset of my dataset:

| Oss | DATA_DEC | IMPO_1 | IMPO_2 | IMPO_3 | IMPO_4 | IMPO_5 | IMPO_6 | IMPO_7 | IMPO_8 | IMPO_9 | IMPO_10 | IMPO_11 | IMPO_12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 15/01/2018 | 549,78 | 547,07 | 544,71 | 542,36 | 539,84 | 537,32 | 534,8 | 532,28 | 530,75 | 529,45 | 527,86 | . |
| 2 | 25/05/2009 | 3.707,41 | 3.701,80 | 3.695,14 | 3.689,64 | 3.719,15 | 3.748,20 | 3.742,56 | 3.737,02 | 3.725,50 | 3.742,90 | 3.731,60 | 3.726,50 |
| 3 | 10/07/2009 | 307,2 | 306,4 | 301,5 | 310,5 | 312,23 | 309,45 | 305,12 | 301,23 | 307,25 | 289,56 | 286,25 | 215,9 |
| 4 | 15/04/2014 | 53,35 | 53,06 | 51,07 | 51,35 | 57,35 | 55,01 | 51,26 | . | . | . | . | . |
| 5 | 05/03/2009 | 121,16 | 119,48 | 114 | 100,89 | 85,82 | 76,43 | 82,31 | 76,71 | 76,25 | 74,69 | 75,14 | 73,12 |

For each observation I need to calculate the average of a variable number of IMPO_ variables; how many of them to include depends on month(DATA_DEC).

For example:

- for the first obs, MONTH(DATA_DEC)=1, so AVERAGE=MEAN(IMPO_1);

- for the second obs, MONTH(DATA_DEC)=5, so AVERAGE=MEAN(OF IMPO_1 -- IMPO_5);

- for the third obs, MONTH(DATA_DEC)=7, so AVERAGE=MEAN(OF IMPO_1 -- IMPO_7);

... and so on for each of the 128967 observations.

The dataset I want should look like this:

| Oss | DATA_DEC | IMPO_1 | IMPO_2 | IMPO_3 | IMPO_4 | IMPO_5 | IMPO_6 | IMPO_7 | IMPO_8 | IMPO_9 | IMPO_10 | IMPO_11 | IMPO_12 | AVERAGE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 15/01/2018 | 549,78 | 547,07 | 544,71 | 542,36 | 539,84 | 537,32 | 534,8 | 532,28 | 530,75 | 529,45 | 527,86 | . | 549,78 |
| 2 | 25/05/2009 | 3.707,41 | 3.701,80 | 3.695,14 | 3.689,64 | 3.719,15 | 3.748,20 | 3.742,56 | 3.737,02 | 3.725,50 | 3.742,90 | 3.731,60 | 3.726,50 | 3.702,63 |
| 3 | 10/07/2009 | 307,2 | 306,4 | 301,5 | 310,5 | 312,23 | 309,45 | 305,12 | 301,23 | 307,25 | 289,56 | 286,25 | 215,9 | 307,49 |
| 4 | 15/04/2014 | 53,35 | 53,06 | 51,07 | 51,35 | 57,35 | 55,01 | 51,26 | . | . | . | . | . | 52,21 |
| 5 | 05/03/2009 | 121,16 | 119,48 | 114 | 100,89 | 85,82 | 76,43 | 82,31 | 76,71 | 76,25 | 74,69 | 75,14 | 73,12 | 118,21 |

How can I do this? I've tried creating a macro variable for each row that holds the value of the month (&month1, &month2, ...) and then referencing it in MEAN(OF IMPO_1 -- IMPO_&&&month&n), but for 128967 rows this is madness.

Thank you!

1 ACCEPTED SOLUTION

Tourmaline | Level 20

## Re: AVERAGE OF A VARIABLE NUMBER OF VALUES

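```
data want;
  set have;
  array t impo_1-impo_12;
  /* UNTIL is tested at the bottom of each pass, so i stops at month(data_dec) */
  do i=1 to dim(t) until(i=month(data_dec));
    sum=sum(sum,t(i));   /* SUM() skips missing amounts */
  end;
  avg=sum/i;
  drop i;
run;
```

/*Or just*/

```
data want;
  set have;
  array t impo_1-impo_12;
  do i=1 to month(data_dec);
    sum=sum(sum,t(i));   /* SUM() skips missing amounts */
  end;
  /* note: when this loop finishes, i equals month(data_dec)+1 */
  avg=sum/i;
  drop i;
run;
```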

Calcite | Level 5

## Re: AVERAGE OF A VARIABLE NUMBER OF VALUES

Thank you so much!
The idea of using arrays is awesome! I made some changes to the code to handle missing values.

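```
data want;
  set have;
  array t impo_1-impo_12;

  do i=1 to month(data_dec);
    sum=sum(sum, t(i));
    /* in a DATA step <> is the MAX operator, so test for missing explicitly */
    if not missing(t(i)) then den=i;
  end;
  /* den is the position of the last non-missing amount, */
  /* so this assumes missing values only appear at the end */
  avg=round(sum/den,0.01);
  drop i sum den;
run;
```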

And it works perfectly! Thank you so much!

PROC Star

## Re: AVERAGE OF A VARIABLE NUMBER OF VALUES

Good idea to use arrays, but ....

The two variations look like they should generate different answers. The denominator in the bottom solution is wrong.

Also, it may be important to consider whether the data might contain missing values for some of the months. Again, the denominator would be incorrect if that happens.
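
Both concerns can be handled by counting the non-missing values explicitly instead of reusing the loop index as the denominator. The following is only a minimal sketch, assuming `data_dec` is a numeric SAS date and that missing amounts should simply be left out of the average; the names `total` and `n_obs` are illustrative and do not come from the original posts.

```sas
/* Sketch: average of the first month(data_dec) amounts, skipping missing values. */
/* total and n_obs are illustrative helper variables, dropped at the end.         */
data want;
  set have;
  array t impo_1-impo_12;
  total = 0;
  n_obs = 0;
  do i = 1 to month(data_dec);
    if not missing(t(i)) then do;
      total = total + t(i);   /* running sum of usable amounts  */
      n_obs = n_obs + 1;      /* how many amounts were actually used */
    end;
  end;
  /* leave average missing when no usable amount was found */
  if n_obs > 0 then average = total / n_obs;
  drop i total n_obs;
run;
```

Because the denominator is a separate counter, it no longer matters what value `i` holds after the loop, and a missing amount in the middle of the range does not inflate the divisor.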
Super User

## Re: AVERAGE OF A VARIABLE NUMBER OF VALUES

Assuming that your data_dec is a SAS date value:

```
data want;
  set have;
  select (month(data_dec));
    when (1) average= mean(impo_1);
    when (2) average= mean(of impo_1 -impo_2);
    when (3) average= mean(of impo_1 -impo_3);
    when (4) average= mean(of impo_1 -impo_4);
    when (5) average= mean(of impo_1 -impo_5);
    when (6) average= mean(of impo_1 -impo_6);
    when (7) average= mean(of impo_1 -impo_7);
    when (8) average= mean(of impo_1 -impo_8);
    when (9) average= mean(of impo_1 -impo_9);
    when (10) average= mean(of impo_1 -impo_10);
    when (11) average= mean(of impo_1 -impo_11);
    when (12) average= mean(of impo_1 -impo_12);
    otherwise;
  end;
run;
```

The Select statement in this case evaluates an expression month(data_dec) and branches to the WHEN value that matches.

I would be careful about using the two-dash list indicator in case anything changes the order of the variables in your data. You could end up with other values in the calculations. When you know that you have sequentially numbered variable names, it is better to use the single dash for the list.
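
To illustrate the difference between the two list styles, here is a small sketch with a made-up layout in which an unrelated variable (`other`, hypothetical) sits between the IMPO_ columns. The dataset name `demo` and all values are invented for the example.

```sas
/* Hypothetical layout: "other" is stored between impo_1 and impo_2 */
data demo;
  impo_1 = 10; other = 999; impo_2 = 20; impo_3 = 30;
  /* single dash: numbered range, picks impo_1, impo_2, impo_3 by name */
  numbered   = mean(of impo_1-impo_3);
  /* double dash: positional range, picks every variable stored between */
  /* impo_1 and impo_3, so "other" is included as well                  */
  positional = mean(of impo_1--impo_3);
  put numbered= positional=;
run;
```

With this layout `numbered` is the mean of 10, 20 and 30, while `positional` also averages in 999, which is the kind of silent contamination the single dash avoids.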

Calcite | Level 5

## Re: AVERAGE OF A VARIABLE NUMBER OF VALUES

Thank you!! It works great! Thank you also for the suggestions regarding the two-dash list indicator.
