mahele
Calcite | Level 5

Hi everyone.
My SAS version is 9.4 and I have a dataset with 128967 observations. For each observation I have a date variable, DATA_DEC (dd/mm/yyyy), and 12 monthly amounts (IMPO_1 -- IMPO_12). Here is a subset of my dataset:

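| Oss | DATA_DEC | IMPO_1 | IMPO_2 | IMPO_3 | IMPO_4 | IMPO_5 | IMPO_6 | IMPO_7 | IMPO_8 | IMPO_9 | IMPO_10 | IMPO_11 | IMPO_12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 15/01/2018 | 549,78 | 547,07 | 544,71 | 542,36 | 539,84 | 537,32 | 534,8 | 532,28 | 530,75 | 529,45 | 527,86 | . |
| 2 | 25/05/2009 | 3.707,41 | 3.701,80 | 3.695,14 | 3.689,64 | 3.719,15 | 3.748,20 | 3.742,56 | 3.737,02 | 3.725,50 | 3.742,90 | 3.731,60 | 3.726,50 |
| 3 | 10/07/2009 | 307,2 | 306,4 | 301,5 | 310,5 | 312,23 | 309,45 | 305,12 | 301,23 | 307,25 | 289,56 | 286,25 | 215,9 |
| 4 | 15/04/2014 | 53,35 | 53,06 | 51,07 | 51,35 | 57,35 | 55,01 | 51,26 | . | . | . | . | . |
| 5 | 05/03/2009 | 121,16 | 119,48 | 114 | 100,89 | 85,82 | 76,43 | 82,31 | 76,71 | 76,25 | 74,69 | 75,14 | 73,12 |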

 

For each observation I need to calculate the average of a variable number of the IMPO_ variables. How many of them go into the average depends on MONTH(DATA_DEC).

 

For example:

- for the first obs, MONTH(DATA_DEC)=1, so AVERAGE=MEAN(IMPO_1);

- for the second obs, MONTH(DATA_DEC)=5, so AVERAGE=MEAN(OF IMPO_1 -- IMPO_5);

- for the third obs, MONTH(DATA_DEC)=7, so AVERAGE=MEAN(OF IMPO_1 -- IMPO_7);

... and so on for each of the 128967 observations.

 

My wanted dataset needs to be like:

 

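| Oss | DATA_DEC | IMPO_1 | IMPO_2 | IMPO_3 | IMPO_4 | IMPO_5 | IMPO_6 | IMPO_7 | IMPO_8 | IMPO_9 | IMPO_10 | IMPO_11 | IMPO_12 | AVERAGE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 15/01/2018 | 549,78 | 547,07 | 544,71 | 542,36 | 539,84 | 537,32 | 534,8 | 532,28 | 530,75 | 529,45 | 527,86 | . | 549,78 |
| 2 | 25/05/2009 | 3.707,41 | 3.701,80 | 3.695,14 | 3.689,64 | 3.719,15 | 3.748,20 | 3.742,56 | 3.737,02 | 3.725,50 | 3.742,90 | 3.731,60 | 3.726,50 | 3.702,63 |
| 3 | 10/07/2009 | 307,2 | 306,4 | 301,5 | 310,5 | 312,23 | 309,45 | 305,12 | 301,23 | 307,25 | 289,56 | 286,25 | 215,9 | 307,49 |
| 4 | 15/04/2014 | 53,35 | 53,06 | 51,07 | 51,35 | 57,35 | 55,01 | 51,26 | . | . | . | . | . | 52,21 |
| 5 | 05/03/2009 | 121,16 | 119,48 | 114 | 100,89 | 85,82 | 76,43 | 82,31 | 76,71 | 76,25 | 74,69 | 75,14 | 73,12 | 118,21 |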

 

How can I do this? I tried creating a macro variable for each row to hold the month value (&month1, &month2, ...) and then referencing it in MEAN(OF IMPO_1 -- IMPO_&&&month&n), but for 128967 rows this is madness.

Could someone help me please!!!
Thank you!

 

ACCEPTED SOLUTION
novinosrin
Tourmaline | Level 20
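data want;
  set have;
  array t impo_1-impo_12;
  /* UNTIL is checked at the bottom of each pass, before i is incremented,
     so this loop exits with i equal to month(data_dec) */
  do i=1 to dim(t) until(i=month(data_dec));
    sum=sum(sum,t(i));
  end;
  avg=sum/i;
  drop i;
run;

/*Or just*/

data want;
  set have;
  array t impo_1-impo_12;
  /* note: a plain iterative DO exits with i equal to month(data_dec)+1 */
  do i=1 to month(data_dec);
    sum=sum(sum,t(i));
  end;
  avg=sum/i;
  drop i;
run;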

 


5 REPLIES
mahele
Calcite | Level 5

Thank you so much!
The idea of using arrays is awesome! I made some code changes to handle missing values.

 

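data want;
	set have;
	array t impo_1-impo_12;

	do i=1 to month(data_dec);
		sum=sum(sum, t(i));

		/* test for missing explicitly; in a DATA step the <> operator means MAX, not "not equal" */
		if not missing(t(i)) then
			den=i;	/* position of the last nonmissing month so far */
	end;

	/* compute the average once, after the loop, and only if a month was found */
	if den then avg=round(sum/den,0.01);
	drop i sum den;
run;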

And it works perfectly! Thank you so much!

 

Astounding
PROC Star
Good idea to use arrays, but ....

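The two variations look like they should generate different answers. The denominator in the bottom solution is wrong: when a plain iterative DO loop finishes, i has already been incremented past the stop value, so i equals month(data_dec)+1.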

Also, it may be important to consider whether the data might contain missing values for some of the months. Again, the denominator would be incorrect if that happens.
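
One way to address both points is to keep an explicit count of the nonmissing months instead of reusing the loop index. The following is only a minimal sketch, not the accepted code: the two sample rows come from the question (observations 4 and 5, typed with decimal points and wrapped over two lines each), and the names total and n_used are made up for illustration.

```sas
/* Two rows from the question; list input continues onto the next line */
data have;
  input oss data_dec :ddmmyy10. impo_1-impo_12;
  format data_dec ddmmyy10.;
  datalines;
4 15/04/2014 53.35 53.06 51.07 51.35 57.35 55.01
  51.26 . . . . .
5 05/03/2009 121.16 119.48 114 100.89 85.82 76.43
  82.31 76.71 76.25 74.69 75.14 73.12
;
run;

data want;
  set have;
  array t impo_1-impo_12;
  total  = 0;   /* hypothetical name: running sum of nonmissing amounts */
  n_used = 0;   /* hypothetical name: how many amounts were nonmissing */
  do i = 1 to month(data_dec);
    if not missing(t(i)) then do;
      total  = total + t(i);
      n_used = n_used + 1;
    end;
  end;
  /* divide by the count of nonmissing months, not by the loop index */
  if n_used > 0 then avg = total / n_used;
  drop i total n_used;
run;

proc print data=want noobs;
  var oss data_dec avg;
run;
```

For these two rows the averages are 52.2075 and about 118.2133, which round to the 52,21 and 118,21 shown in the wanted table. A missing amount inside the first month(data_dec) columns would simply be left out of both the sum and the count.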
ballardw
Super User

Assuming that your data_dec is a SAS date value:

data want;
 set have;
 select (month(data_dec));
   when (1) average= mean(impo_1);
   when (2) average= mean(of impo_1 -impo_2);
   when (3) average= mean(of impo_1 -impo_3);
   when (4) average= mean(of impo_1 -impo_4);
   when (5) average= mean(of impo_1 -impo_5);
   when (6) average= mean(of impo_1 -impo_6);
   when (7) average= mean(of impo_1 -impo_7);
   when (8) average= mean(of impo_1 -impo_8);
   when (9) average= mean(of impo_1 -impo_9);
   when (10) average= mean(of impo_1 -impo_10);
   when (11) average= mean(of impo_1 -impo_11);
   when (12) average= mean(of impo_1 -impo_12);
   otherwise;
 end;
run;

The SELECT statement here evaluates the expression month(data_dec) and branches to the WHEN clause whose value matches.

I would be careful about using the two-dash list indicator in case anything changes the order of the variables in your data set. A two-dash list takes every variable positioned between the two names, so you could end up with other values in the calculation. When you know that you have sequentially numbered names, it is better to use the single dash for the list.
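
A small sketch of the difference, using a made-up data set in which an unrelated numeric variable (here called other, a hypothetical name) sits between impo_1 and impo_2:

```sas
/* Hypothetical layout: "other" is created between impo_1 and impo_2 */
data demo;
  impo_1 = 10;
  other  = 1000;
  impo_2 = 20;
  by_position = mean(of impo_1--impo_2); /* impo_1, other, impo_2 */
  by_number   = mean(of impo_1-impo_2);  /* impo_1 and impo_2 only */
  put by_position= by_number=;
run;
```

The log should show by_position at roughly 343.33 because other is swept into the list, while by_number is 15.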

mahele
Calcite | Level 5

Thank you!! It works great! Thank you also for the suggestions regarding the two-dash list indicator.
