Hi everyone.
I'm on SAS 9.4 and have a dataset with 128967 observations. Each observation has a date variable (dd/mm/yyyy) and 12 monthly amounts (IMPO_1 -- IMPO_12). Here is a subset of my dataset:
Oss | DATA_DEC | IMPO_1 | IMPO_2 | IMPO_3 | IMPO_4 | IMPO_5 | IMPO_6 | IMPO_7 | IMPO_8 | IMPO_9 | IMPO_10 | IMPO_11 | IMPO_12 |
1 | 15/01/2018 | 549,78 | 547,07 | 544,71 | 542,36 | 539,84 | 537,32 | 534,8 | 532,28 | 530,75 | 529,45 | 527,86 | . |
2 | 25/05/2009 | 3.707,41 | 3.701,80 | 3.695,14 | 3.689,64 | 3.719,15 | 3.748,20 | 3.742,56 | 3.737,02 | 3.725,50 | 3.742,90 | 3.731,60 | 3.726,50 |
3 | 10/07/2009 | 307,2 | 306,4 | 301,5 | 310,5 | 312,23 | 309,45 | 305,12 | 301,23 | 307,25 | 289,56 | 286,25 | 215,9 |
4 | 15/04/2014 | 53,35 | 53,06 | 51,07 | 51,35 | 57,35 | 55,01 | 51,26 | . | . | . | . | . |
5 | 05/03/2009 | 121,16 | 119,48 | 114 | 100,89 | 85,82 | 76,43 | 82,31 | 76,71 | 76,25 | 74,69 | 75,14 | 73,12 |
For each observation I need to calculate the average of a variable number of the IMPO_ columns, where that number is given by MONTH(DATA_DEC).
For example:
- for the first obs the MONTH(DATA_DEC)=1 so AVERAGE=MEAN(IMPO_1);
- for the second obs the MONTH(DATA_DEC)=5 so AVERAGE=MEAN(OF IMPO_1 -- IMPO_5);
- for the third obs the MONTH(DATA_DEC)=7 so AVERAGE=MEAN(OF IMPO_1 -- IMPO_7);
.... and so on for each of 128967 observations.
The output dataset should look like this:
Oss | DATA_DEC | IMPO_1 | IMPO_2 | IMPO_3 | IMPO_4 | IMPO_5 | IMPO_6 | IMPO_7 | IMPO_8 | IMPO_9 | IMPO_10 | IMPO_11 | IMPO_12 | AVERAGE |
1 | 15/01/2018 | 549,78 | 547,07 | 544,71 | 542,36 | 539,84 | 537,32 | 534,8 | 532,28 | 530,75 | 529,45 | 527,86 | . | 549,78 |
2 | 25/05/2009 | 3.707,41 | 3.701,80 | 3.695,14 | 3.689,64 | 3.719,15 | 3.748,20 | 3.742,56 | 3.737,02 | 3.725,50 | 3.742,90 | 3.731,60 | 3.726,50 | 3.702,63 |
3 | 10/07/2009 | 307,2 | 306,4 | 301,5 | 310,5 | 312,23 | 309,45 | 305,12 | 301,23 | 307,25 | 289,56 | 286,25 | 215,9 | 307,49 |
4 | 15/04/2014 | 53,35 | 53,06 | 51,07 | 51,35 | 57,35 | 55,01 | 51,26 | . | . | . | . | . | 52,21 |
5 | 05/03/2009 | 121,16 | 119,48 | 114 | 100,89 | 85,82 | 76,43 | 82,31 | 76,71 | 76,25 | 74,69 | 75,14 | 73,12 | 118,21 |
How can I do this? I've tried creating a macro variable for each row to hold the value of the month (&month1, &month2, ...) and then referencing it in MEAN(OF IMPO_1 -- IMPO_&&&month&n), but for 128967 rows this is madness.
Could someone help me please!!!
Thank you!
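data want;
  set have;
  array t impo_1-impo_12;
  /* UNTIL is checked at the bottom of the loop, so i equals the month on exit */
  do i=1 to dim(t) until(i=month(data_dec));
    sum=sum(sum,t(i));
  end;
  avg=sum/i;
  drop i;
run;
/*Or just*/
data want;
  set have;
  array t impo_1-impo_12;
  do i=1 to month(data_dec);
    sum=sum(sum,t(i));
  end;
  /* after a plain iterative DO, i is month+1, so divide by the month itself */
  avg=sum/month(data_dec);
  drop i;
run;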
Thank you so much!
The idea of using arrays is awesome! I made some changes to the code to handle missing values.
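data want;
  set have;
  array t impo_1-impo_12;
  do i=1 to month(data_dec);
    /* in a DATA step <> is the MAX operator, not "not equal", so use MISSING() */
    if not missing(t(i)) then do;
      sum=sum(sum, t(i));
      den=sum(den, 1); /* count of non-missing amounts */
    end;
  end;
  /* average over the non-missing amounts; stays missing if there are none */
  if den > 0 then avg=round(sum/den, 0.01);
  drop i sum den;
run;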
and it works perfectly! thank you so much!
Assuming that your data_dec is a SAS date value:
data want;
  set have;
  select (month(data_dec));
    when (1)  average = mean(impo_1);
    when (2)  average = mean(of impo_1-impo_2);
    when (3)  average = mean(of impo_1-impo_3);
    when (4)  average = mean(of impo_1-impo_4);
    when (5)  average = mean(of impo_1-impo_5);
    when (6)  average = mean(of impo_1-impo_6);
    when (7)  average = mean(of impo_1-impo_7);
    when (8)  average = mean(of impo_1-impo_8);
    when (9)  average = mean(of impo_1-impo_9);
    when (10) average = mean(of impo_1-impo_10);
    when (11) average = mean(of impo_1-impo_11);
    when (12) average = mean(of impo_1-impo_12);
    otherwise;
  end;
run;
The SELECT statement here evaluates the expression month(data_dec) and branches to the WHEN clause whose value matches. The empty OTHERWISE leaves AVERAGE missing when the date itself is missing.
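MONTH() needs a numeric SAS date, which is why the assumption above matters. If the date were instead stored as dd/mm/yyyy text (the thread does not say either way), a conversion along these lines might be needed first. This is only a sketch: data_dec_txt, data_dec_num, and have_dates are hypothetical names, not variables from the original dataset.

```sas
/* Sketch only: assumes a character variable data_dec_txt (hypothetical name)
   that holds text such as 15/01/2018. */
data have_dates;
  data_dec_txt = '15/01/2018';
  /* the DDMMYY10. informat reads dd/mm/yyyy text into a numeric SAS date */
  data_dec_num = input(data_dec_txt, ddmmyy10.);
  format data_dec_num ddmmyy10.;
  /* month number that decides how many IMPO_ columns to average */
  m = month(data_dec_num);
run;
```

If data_dec is already numeric and only displayed with a dd/mm/yyyy format, no conversion is needed and MONTH(data_dec) can be used directly.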
I would be careful about using the two-dash list indicator (IMPO_1 -- IMPO_12), because it selects variables by their position in the dataset. If anything changes the order of your variables, you could end up with other values in the calculation. When you know that you have sequentially numbered names, it is better to use the single-dash list (IMPO_1-IMPO_12).
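A small sketch of the difference between the two list styles. The dataset demo and the variable other are made up for illustration; other is deliberately created between the IMPO_ variables so that it sits inside the positional range.

```sas
/* Sketch: OTHER is a hypothetical variable that lands between the IMPO_ columns */
data demo;
  impo_1 = 10;
  other  = 999;
  impo_2 = 20;
  impo_3 = 30;
  /* name range list: every variable positioned from impo_1 through impo_3,
     so OTHER is included in the mean */
  by_position = mean(of impo_1 -- impo_3);
  /* numbered range list: only impo_1, impo_2 and impo_3, wherever they sit */
  by_number = mean(of impo_1 - impo_3);
run;
```

Here by_position is pulled up by the 999 in other, while by_number averages only the three IMPO_ values.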
Thank you!! It works great! Thank you also for the suggestions regarding the two-dash list indicator.