seeking a better way to sum variables seperatly by the same group

Reply
Contributor
Posts: 49

seeking a better way to sum variables seperatly by the same group

Hi everyone,

 

I'm having a problem to generate a more efficient code to sum a series of variable of the same type seperately (STMinTenday_grp1-STMinTenday_grp37) by the same group. Below is my original code. I've tried using proc sql combined do loop or data step with array. But neither of my plan worked. Hopefully, someone here could point me to a right direction in doing this. Thanks a lot!

 

proc sql;
   create table temp3.test as
      select *, 
             sum(STMinTenday_grp1)as sum_STMinTenday_grp1, sum(STMinTenday_grp2)as sum_STMinTenday_grp2,
             sum(STMinTenday_grp3)as sum_STMinTenday_grp3, sum(STMinTenday_grp4)as sum_STMinTenday_grp4,
			 sum(STMinTenday_grp5)as sum_STMinTenday_grp5, sum(STMinTenday_grp6)as sum_STMinTenday_grp6,
             sum(STMinTenday_grp7)as sum_STMinTenday_grp7, sum(STMinTenday_grp8)as sum_STMinTenday_grp8,
             sum(STMinTenday_grp9)as sum_STMinTenday_grp9, sum(STMinTenday_grp10)as sum_STMinTenday_grp10,
             sum(STMinTenday_grp11)as sum_STMinTenday_grp11, sum(STMinTenday_grp12)as sum_STMinTenday_grp12,
			 sum(STMinTenday_grp13)as sum_STMinTenday_grp13, sum(STMinTenday_grp14)as sum_STMinTenday_grp14,
             sum(STMinTenday_grp15)as sum_STMinTenday_grp15, sum(STMinTenday_grp16)as sum_STMinTenday_grp16,
             sum(STMinTenday_grp17)as sum_STMinTenday_grp17, sum(STMinTenday_grp18)as sum_STMinTenday_grp18,
             sum(STMinTenday_grp19)as sum_STMinTenday_grp19, sum(STMinTenday_grp20)as sum_STMinTenday_grp20,
			 sum(STMinTenday_grp21)as sum_STMinTenday_grp21, sum(STMinTenday_grp22)as sum_STMinTenday_grp22,
             sum(STMinTenday_grp23)as sum_STMinTenday_grp23, sum(STMinTenday_grp24)as sum_STMinTenday_grp24,
             sum(STMinTenday_grp25)as sum_STMinTenday_grp25, sum(STMinTenday_grp26)as sum_STMinTenday_grp26,
             sum(STMinTenday_grp27)as sum_STMinTenday_grp27, sum(STMinTenday_grp28)as sum_STMinTenday_grp28,
			 sum(STMinTenday_grp29)as sum_STMinTenday_grp29, sum(STMinTenday_grp30)as sum_STMinTenday_grp30,
             sum(STMinTenday_grp31)as sum_STMinTenday_grp31, sum(STMinTenday_grp32)as sum_STMinTenday_grp32,
             sum(STMinTenday_grp33)as sum_STMinTenday_grp33, sum(STMinTenday_grp34)as sum_STMinTenday_grp34,
             sum(STMinTenday_grp35)as sum_STMinTenday_grp35, sum(STMinTenday_grp36)as sum_STMinTenday_grp36,
             sum(STMinTenday_grp37)as sum_STMinTenday_grp37     
	 from temp3.try10day
	    group by bene_id;
quit;
SAS Super FREQ
Posts: 3,416

Re: seeking a better way to sum variables seperatly by the same group

You can use PROC MEANS and put the group variable on a CLASS statement. The output variables will have the variables

GROUP  var1_Sum var2_Sum var3_sum...

 

Here's an example:

 

proc means data=sashelp.class noprint;
class sex;
vars height weight age;
output out=out sum= / autoname;
run;

If you need the group levels to be part of the variable name, you'll have to run an additional step. Let us know if this looks like something you want to pursue.

Contributor
Posts: 49

Re: seeking a better way to sum variables seperatly by the same group

Rick, thank you for your valuable suggestions. Variable name is not an issue here. Would I need to link the resulting aggregate data back to the original data? I still need to keep the variables in the original one for further estimate.
Grand Advisor
Posts: 17,406

Re: seeking a better way to sum variables seperatly by the same group

What's the next step? Maybe it can be incorporated into proc means or the data step solution. 

SAS Super FREQ
Posts: 3,416

Re: seeking a better way to sum variables seperatly by the same group

If you are going to merge the statistics back into the original data set, then I agree that sorting and using a BY statement (and NWAY option) is worthwhile:

 

proc sort data=sashelp.class out=have;
by sex;
run;

proc means data=have noprint nway;
by sex;
vars height weight age;
output out=out sum= / autoname;
run;

data want;
merge have out;
by sex;
run;
Respected Advisor
Posts: 4,995

Re: seeking a better way to sum variables seperatly by the same group

Your post mentions that you tried a DATA step.  In that case, your data must be in sorted order by BENE_ID. 

 

Under those conditions, here is an approach you could try:

 

data want;

array previous {37} STMinTenDay_grp1 - STMinTenDay_grp37;

array totals {37} sum_STMinTenDay_grp1 - sum_STMinTenDay_grp37;

do _n_=1 to 37;

   totals{_n_}=0;

end;

do until (last.bene_id);

   set temp3.try10day;

   by bene_id;

   do _n_=1 to 37;

      totals{_n_} + previous{_n_};

   end;

end;

*** Totals have been calculated, now apply them to each record;

do until (last.bene_id);

   set temp3.try10day;

   by bene_id;

   *** Do some calculations here;

   output;

end;

run;

 

Also note, just in case  you want to stick with PROC SUMMARY ... when your data set is already sorted, the BY statement is often faster than the CLASS statement.  (When you only have a few records per BENE_ID, the reverse can be true making CLASS faster than BY.)  If you do use the CLASS statement, you may need to add the word NWAY to the end of the PROC statement to avoid calculating an extra observation holding grand totals across all values of BENE_ID.

Contributor
Posts: 49

Re: seeking a better way to sum variables seperatly by the same group

Thank you for sharing this code and your thoughts. This is great!
Ask a Question
Discussion stats
  • 6 replies
  • 349 views
  • 4 likes
  • 4 in conversation