mahele
Calcite | Level 5

Hi everyone.
I am using SAS 9.4 and have a dataset with 128967 observations. Each observation has a date variable DATA_DEC (dd/mm/yyyy) and 12 monthly amounts (IMPO_1 -- IMPO_12). Here is a subset of my dataset:

Obs  DATA_DEC    IMPO_1    IMPO_2    IMPO_3    IMPO_4    IMPO_5    IMPO_6    IMPO_7    IMPO_8    IMPO_9    IMPO_10   IMPO_11   IMPO_12
1    15/01/2018  549,78    547,07    544,71    542,36    539,84    537,32    534,8     532,28    530,75    529,45    527,86    .
2    25/05/2009  3.707,41  3.701,80  3.695,14  3.689,64  3.719,15  3.748,20  3.742,56  3.737,02  3.725,50  3.742,90  3.731,60  3.726,50
3    10/07/2009  307,2     306,4     301,5     310,5     312,23    309,45    305,12    301,23    307,25    289,56    286,25    215,9
4    15/04/2014  53,35     53,06     51,07     51,35     57,35     55,01     51,26     .         .         .         .         .
5    05/03/2009  121,16    119,48    114       100,89    85,82     76,43     82,31     76,71     76,25     74,69     75,14     73,12
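
For anyone who wants to try the answers in this thread, the five sample rows can be rebuilt roughly as follows. This is only a sketch: the dataset name HAVE matches what the replies use, the column name OBS is an assumption, and the amounts are retyped with a period as the decimal separator (and no thousands separator) so that plain list input can read them.

```sas
/* Sketch: rebuild the five sample rows as a dataset named HAVE.  */
/* A lone period in the data lines stands for a missing amount.   */
data have;
    input obs data_dec :ddmmyy10. impo_1-impo_12;
    format data_dec ddmmyy10.;
    datalines;
1 15/01/2018 549.78 547.07 544.71 542.36 539.84 537.32 534.8 532.28 530.75 529.45 527.86 .
2 25/05/2009 3707.41 3701.80 3695.14 3689.64 3719.15 3748.20 3742.56 3737.02 3725.50 3742.90 3731.60 3726.50
3 10/07/2009 307.2 306.4 301.5 310.5 312.23 309.45 305.12 301.23 307.25 289.56 286.25 215.9
4 15/04/2014 53.35 53.06 51.07 51.35 57.35 55.01 51.26 . . . . .
5 05/03/2009 121.16 119.48 114 100.89 85.82 76.43 82.31 76.71 76.25 74.69 75.14 73.12
;
run;
```

The `:ddmmyy10.` informat turns the dd/mm/yyyy text into a numeric SAS date, which is what MONTH() expects.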

 

For each observation I need to calculate the average of a variable number of IMPO_ variables; how many of them to include depends on MONTH(DATA_DEC).

 

For example:

- for the first obs, MONTH(DATA_DEC)=1, so AVERAGE=MEAN(IMPO_1);

- for the second obs, MONTH(DATA_DEC)=5, so AVERAGE=MEAN(OF IMPO_1 -- IMPO_5);

- for the third obs, MONTH(DATA_DEC)=7, so AVERAGE=MEAN(OF IMPO_1 -- IMPO_7);

... and so on for each of the 128967 observations.

 

The output dataset I want looks like this:

 

Obs  DATA_DEC    IMPO_1    IMPO_2    IMPO_3    IMPO_4    IMPO_5    IMPO_6    IMPO_7    IMPO_8    IMPO_9    IMPO_10   IMPO_11   IMPO_12   AVERAGE
1    15/01/2018  549,78    547,07    544,71    542,36    539,84    537,32    534,8     532,28    530,75    529,45    527,86    .         549,78
2    25/05/2009  3.707,41  3.701,80  3.695,14  3.689,64  3.719,15  3.748,20  3.742,56  3.737,02  3.725,50  3.742,90  3.731,60  3.726,50  3.702,63
3    10/07/2009  307,2     306,4     301,5     310,5     312,23    309,45    305,12    301,23    307,25    289,56    286,25    215,9     307,49
4    15/04/2014  53,35     53,06     51,07     51,35     57,35     55,01     51,26     .         .         .         .         .         52,21
5    05/03/2009  121,16    119,48    114       100,89    85,82     76,43     82,31     76,71     76,25     74,69     75,14     73,12     118,21

 

How can I do this? I tried creating one macro variable per row holding the month value (&month1, &month2, ...) and then referencing it in MEAN(OF IMPO_1 -- IMPO_&&&month&n), but for 128967 rows this is madness.

Could someone help me, please?
Thank you!

 

1 ACCEPTED SOLUTION

novinosrin
Tourmaline | Level 20
data want;
    set have;
    array t impo_1-impo_12;
    do i=1 to dim(t) until(i=month(data_dec));
        sum=sum(sum,t(i));
    end;
    avg=sum/i;
    drop i;
run;

/*Or just*/

data want;
    set have;
    array t impo_1-impo_12;
    do i=1 to month(data_dec);
        sum=sum(sum,t(i));
    end;
    avg=sum/i;
    drop i;
run;

 


5 REPLIES

mahele
Calcite | Level 5

Thank you so much!
The idea of using arrays is awesome! I made some changes to the code to handle missing values.

 

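data want;
	set have;
	array t impo_1-impo_12;

	do i=1 to month(data_dec);
		sum=sum(sum, t(i));

		/* NE rather than <>: in a DATA step, <> is the MAX operator */
		if t(i) ne . then
			den=i;	/* position of the last non-missing amount so far */
	end;
	avg=round(sum/den, 0.01);
	drop i sum den;
run;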

And it works perfectly! Thank you so much!

 

Astounding
PROC Star
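Good idea to use arrays, but ....

The two variations look like they should generate different answers. The denominator in the bottom solution is wrong: a plain iterative DO loop finishes with i equal to month(data_dec) + 1, so avg=sum/i divides by one too many. In the top solution the UNTIL condition stops the loop before i is incremented, so i equals month(data_dec).

Also, it may be important to consider whether the data might contain missing values for some of the months. The SUM function skips missing values, so the denominator would again be too large if that happens.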
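
One way to address both points is to count the non-missing amounts while summing them. The following is a minimal sketch rather than a tested drop-in: TOTAL and N_VALID are illustrative names, AVERAGE is taken from the question, and DATA_DEC is assumed to be a non-missing SAS date on every row.

```sas
data want;
    set have;
    array t impo_1-impo_12;

    total   = 0;   /* running sum of the non-missing amounts */
    n_valid = 0;   /* how many amounts were non-missing      */

    do i = 1 to month(data_dec);
        if not missing(t(i)) then do;
            total   = total + t(i);
            n_valid = n_valid + 1;
        end;
    end;

    /* AVERAGE stays missing when no amount was available */
    if n_valid > 0 then average = total / n_valid;

    drop i total n_valid;
run;
```

Because the denominator is a count kept inside the loop, it no longer depends on the value of i after the loop ends, and a missing month in the middle of the range is left out of both the sum and the count.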
ballardw
Super User

Assuming that your data_dec is a SAS date value:
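
If PROC CONTENTS shows DATA_DEC as a character variable instead, it would need converting before MONTH() can be applied. A hedged sketch of that conversion follows; it applies only in that hypothetical case, and the names HAVE_DATES and DATA_DEC_TXT are made up for illustration.

```sas
/* Check whether DATA_DEC is numeric (a SAS date) or character. */
proc contents data=have;
run;

/* Only for the hypothetical case where DATA_DEC is text such as '15/01/2018'. */
data have_dates;
    set have(rename=(data_dec=data_dec_txt));
    data_dec = input(data_dec_txt, ddmmyy10.);   /* text -> SAS date value */
    format data_dec ddmmyy10.;
    drop data_dec_txt;
run;
```
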

data want;
 set have;
 select (month(data_dec));
   when (1) average= mean(impo_1);
   when (2) average= mean(of impo_1 -impo_2);
   when (3) average= mean(of impo_1 -impo_3);
   when (4) average= mean(of impo_1 -impo_4);
   when (5) average= mean(of impo_1 -impo_5);
   when (6) average= mean(of impo_1 -impo_6);
   when (7) average= mean(of impo_1 -impo_7);
   when (8) average= mean(of impo_1 -impo_8);
   when (9) average= mean(of impo_1 -impo_9);
   when (10) average= mean(of impo_1 -impo_10);
   when (11) average= mean(of impo_1 -impo_11);
   when (12) average= mean(of impo_1 -impo_12);
   otherwise;
 end;
run;

The SELECT statement here evaluates the expression month(data_dec) and executes the WHEN clause whose value matches.

 

I would be careful about using the two-dash list (impo_1 -- impo_12), because it picks variables by their position in the dataset. If anything changes the order of your variables, you could end up with other values in the calculations. When you know that you have sequentially numbered variables, it is better to use the single dash for the list (impo_1 - impo_12).
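
A small illustration of the difference, using a made-up dataset DEMO in which an unrelated variable OTHER sits between the IMPO_ variables:

```sas
data demo;
    /* Variable order in the dataset: impo_1, other, impo_2, impo_3 */
    impo_1 = 10;
    other  = 1000;   /* hypothetical unrelated column */
    impo_2 = 20;
    impo_3 = 30;

    by_name     = mean(of impo_1-impo_3);    /* numbered range: impo_1, impo_2, impo_3  */
    by_position = mean(of impo_1--impo_3);   /* positional range: OTHER is included too */
run;

proc print data=demo;
run;
```

Here BY_NAME is 20, while BY_POSITION is 265 because OTHER falls between IMPO_1 and IMPO_3 in the variable order.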

mahele
Calcite | Level 5

Thank you!! It works great! Thank you also for the suggestions regarding the two-dash list indicator.
