I need to create sum of 4 variables multiple times each time with new set of variables. For e.g. A1=sum(a1,a2,a3,a4),B1=sum(b1,b2,b3,b4) & so on. So , I am trying to write a macro that will help me do it easily. Following is the code:
Say I have the variables as follows:
A1 A2 A3 A4 B1 B2 B3 B4
1 2 2 4 5 7 8 9
4 6 7 8 6 9 10 11
I need 2 additional columns & my new dataset should look like:
A1 A2 A3 A4 B1 B2 B3 B4 A B
1 2 2 4 5 7 8 9 9 29
4 6 7 8 6 9 10 11 25 36
Can this be done?
%macro SUM2(VAR1,var2,var3,VAR4); data Subs_60_new; set Subs_60; substr(&var1,1,10)=sum(&var1,&var2,&var3,&var4); run; %mend sum2; options mprint mlogic;
%sum2(ADDITIONAL_INFO_Q1,ADDITIONAL_INFO_Q2,ADDITIONAL_INFO_Q3,ADDITIONAL_INFO_Q4);
I
Your problem is a good example of bad data modelling. If you setup your data using long format rather than wide (yes you can, output does not need to match programming use) then your problem becomes moot:
data have; input id add_chng_q1 add_chng_q2 add_chng_q3 add_chng_q4 additional_info_q1 additional_info_q2 additional_info_q3 additional_info_q4; datalines; 1 10 40 55 90 89 70 61 62 2 16 16 22 28 34 40 46 52 ; run; proc transpose data=have out=inter; by id; var add_chng_q1--add_chng_q4; run; proc sql; create table WANT as select ID, _NAME_, sum(COL1) as RESULT from WORK.INTER group by ID, _NAME_; quit;
You can transpose back up if necessary, merge the results back to the data etc. I tend to work with normalised datasets 99% of the time, far easier to work with. If output - either report or data - needs to be transposed, then do that step at the end of the process.
Now, you could do a macro like this:
data have; input add_chng_q1 add_chng_q2 add_chng_q3 add_chng_q4 additional_info_q1 additional_info_q2 additional_info_q3 additional_info_q4; datalines; 10 40 55 90 89 70 61 62 16 16 22 28 34 40 46 52 ; run; options mlogic mprint symbolgen; %macro SumThem (var=); data have; set have; &var.=sum(of &var.:); run; %mend SumThem; %SumThem (var=add_chng_q);
Then you could specify each variable set or you could pull a list out directly from sashelp.vcolumns.
Your probably over thinking the problem, just use arrays or lists of variables:
data have; input A1 A2 A3 A4 B1 B2 B3 B4; datalines; 1 2 2 4 5 7 8 9 4 6 7 8 6 9 10 11 ; run; data want; set have; a=sum(of a:); b=sum(of b:); run;
I think I should have described my data a little more:
ADD_CHNG_Q1 | ADD_CHNG_Q2 | ADD_CHNG_Q3 | ADD_CHNG_Q4 | ADDITIONAL_INFO_Q1 | ADDITIONAL_INFO_Q2 | ADDITIONAL_INFO_Q3 | ADDITIONAL_INFO_Q4 |
10 | 40 | 55 | 90 | 89 | 70 | 61 | 62 |
16 | 16 | 22 | 28 | 34 | 40 | 46 | 52 |
22 | 8 | 11 | 34 | 21 | 10 | 31 | 42 |
28 | 32 | 44 | 96 | 76 | 20 | 16 | 32 |
I have around 800 variables each having a suffix running 1 through 4 & need to sum them .Like Add_Chng_tot=sum(add_chng_q1-add_chng_q4) & so on ...
I could have done the following:
%macro get2(var1,var2,var3,var4,var5);
data get;
set get;
&var5=&var1+&var2+&var3+&var4;
run;
%mend get2;
%get2(ADD_CHNG_Q1,ADD_CHNG_Q2,ADD_CHNG_Q3,ADD_CHNG_Q4,ADD_CHNG_Tot)
But since I have many variables so it is a task to type in the name of the new variable each time (whic is var5 here)
Your problem is a good example of bad data modelling. If you setup your data using long format rather than wide (yes you can, output does not need to match programming use) then your problem becomes moot:
data have; input id add_chng_q1 add_chng_q2 add_chng_q3 add_chng_q4 additional_info_q1 additional_info_q2 additional_info_q3 additional_info_q4; datalines; 1 10 40 55 90 89 70 61 62 2 16 16 22 28 34 40 46 52 ; run; proc transpose data=have out=inter; by id; var add_chng_q1--add_chng_q4; run; proc sql; create table WANT as select ID, _NAME_, sum(COL1) as RESULT from WORK.INTER group by ID, _NAME_; quit;
You can transpose back up if necessary, merge the results back to the data etc. I tend to work with normalised datasets 99% of the time, far easier to work with. If output - either report or data - needs to be transposed, then do that step at the end of the process.
Now, you could do a macro like this:
data have; input add_chng_q1 add_chng_q2 add_chng_q3 add_chng_q4 additional_info_q1 additional_info_q2 additional_info_q3 additional_info_q4; datalines; 10 40 55 90 89 70 61 62 16 16 22 28 34 40 46 52 ; run; options mlogic mprint symbolgen; %macro SumThem (var=); data have; set have; &var.=sum(of &var.:); run; %mend SumThem; %SumThem (var=add_chng_q);
Then you could specify each variable set or you could pull a list out directly from sashelp.vcolumns.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.