Greetings!
I'd like to sum a set of variables, the set of which are identified by a variable in the dataset. I realize there are many ways to do this, but ultimately I'll be processing large output from PROC MCMC where array statements are unwieldy.
In the data set 'test' I identify the variable list to be summed by the first column labeled 'x', so that I will compute a_1+a_2+a_3 in the first row, then b_1+b_2+b_3 in the second row, etc.
data in;
input x $ a_1 a_2 a_3 b_1 b_2 b_3 c_1 c_2 c_3;
datalines;
a 1 1 1 2 2 2 3 3 3
b 1 1 1 2 2 2 3 3 3
c 1 1 1 2 2 2 3 3 3
;
run;
The macro %vTest writes out the summation:
%macro vtest(xVal);
0 %do i = 1 %to 3; +&xVal._&i. %end;
%mend vtest;
data set 'test_data' reads dataset 'in' and processes using the RESOLVE function and %vtest. Results are then printed.
data test_data;
set in;
rc = resolve(cats('%vtest(',x,')'));
run;
options nocenter;
proc print data= test_data noobs;
run;
As you can see, the column rc shows 0+a_1+a_2+a_3. I'd like rc to show the actual sum, i.e. 3, 6, and 9.
Many thanks in advance!
Garnett
Not sure what the macro has to do with the problem. But if you want to generate code then first generate the code to a file and then use %INCLUDE to make it part of your data step.
proc sort data=in(keep=x) out=xlist nodupkey;
by x ;
run;
filename code temp;
data _null_;
set xlist end=eof;
file code ;
if _n_=1 then put 'select (x);' ;
put 'when (' x :$quote. ') rc=%vtest(' x ');' ;
if eof then put 'end;' ;
run;
options mprint;
data want ;
set in ;
%include code / source2 ;
run;
Result.
1271 options mprint; 1272 data want ; 1273 set in ; 1274 %include code / source2 ; NOTE: %INCLUDE (level 1) file CODE is file C:\Users\...\#LN00069. 1275 +select (x); 1276 +when ("a" ) rc=%vtest(a ); MPRINT(VTEST): 0 +a_1 +a_2 +a_3 1277 +when ("b" ) rc=%vtest(b ); MPRINT(VTEST): 0 +b_1 +b_2 +b_3 1278 +when ("c" ) rc=%vtest(c ); MPRINT(VTEST): 0 +c_1 +c_2 +c_3 1279 +end; NOTE: %INCLUDE (level 1) ending. 1280 run; NOTE: There were 3 observations read from the data set WORK.IN. NOTE: The data set WORK.WANT has 3 observations and 11 variables.
Well this certainly gets complicated, and I hope you are right that you can't use ARRAYS in your real-world situation, but I am skeptical.
However, you seem to have over-complicated things even in the macro case. I believe this is what you want:
data test_data; set in; rc = %vtest(a); run;
,
Thanks very much for looking at this! Your solution just resolves to a_1+a_2+a_3 for each row of the dataset. I'd like the summation over variables that can be defined by column x of the input dataset.
It's not clear how you plan to define the rules and function - that's important.If you can provide some more details we can provide better options.
Wouldn't IML maybe be a better approach in this case?
Thanks for looking at this, Reeza.
IML might be the approach I take. I'd like to find a datastep + macro solution if possible.
@gnet wrote:
Thanks for looking at this, Reeza.
IML might be the approach I take. I'd like to find a datastep + macro solution if possible.
It might help to describe exactly what you are attempting. Summing within "columns", variables in SAS terminology, is usually done in either a summary procedure such as Proc Means or Summary. Summing multiple variables often done in data step or Proc SQL. Depending on needs Proc Report maybe able to do both.
I will say that your example data is poor as you could sums of 3, either within observation ("row") by adding a_1+a_2+a_3 OR adding the three rows for any of a_1, a_2 or a_3.
Example:
proc means data=in; var _numeric_; output out=summary (drop=_:) sum=; run;
Will sum each variable in the example data set.
If then need to sum the variables of a, b and c groups:
data want; set summary; suma= sum(of a:); sumb= sum(of b:); sumc= sum(of c:); keep sum: ; run;
That proc means and data step could be turned into a macro if needed but seeing an actual use case with data that lets us know which way and specific sum should go might help.
If you want macro code to add numbers you need to pass the numbers to the macro environment. It will know nothing of the data step variable A_1, let alone what value it has for this iteration of the data step.
How many possible values of X are there?
data want ;
set in ;
select (x);
when ('a') result = sum(of a_:);
when ('b') result = sum(of b_:);
when ('c') result = sum(of c_:);
end;
run;
Thanks Tom!
There may be thousands of values of 'x', but more importantly the macro %vtest may be an arbitrary function and not just a sum. I used the sum in the example to make it easier to present.
Thanks again,
Garnett
You could;
1. Transpose
2. extract letter prefix from _name_ variable, this forms a group
3. look up with hash or any look up that you are comfortable with and execute sum
@gnet Try this, if you still can't find it, I can show a demo using your sample piece shortly after my class
You could follow Tom's example but create a loop to go through all your values of x then write whichever generic function you use instead of sum.
**create array with distinct values of X;
proc sql noprint;
select distinct(x) into :ValuesOfX separated by ' '
from in;
%let NumValuesOfX = &SqlObs.;
**Generate test_data;
%macro test;
data test_data;
set in;
select(x);
%do i=1 %to &NumValuesOfX.;
%let XVal=%scan(&ValuesOfX., &i.);
when ("&XVal.") rc= sum(of &XVal._:);
%end;
end;
run;
proc print data= test_data noobs;
run;
%mend;
%test;
@gnet wrote:
Thanks Tom!
There may be thousands of values of 'x', but more importantly the macro %vtest may be an arbitrary function and not just a sum. I used the sum in the example to make it easier to present.
Thanks again,
Garnett
"Thousands of values of 'X' " has the potential to have issues related to "thousands of macro variables" and having to adjust macro memory settings and other issues.
Actually "sum" using +&xVal._&I construct is a very poor approach because if any single variable is missing then the result is missing, unless that is your desired behavior.
Perhaps you really want to use Proc FCMP to make your function which does allow passing an array of values and returning a single value
You could follow Tom's example and create a loop to go over all your X values and then substitute the sum with any generic fucntion you are using.
**create array with distinct values of X;
proc sql noprint;
select distinct(x) into :ValuesOfX separated by ' '
from in;
%let NumValuesOfX = &SqlObs.;
**Generate test_data;
%macro test;
data test_data;
set in;
select(x);
%do i=1 %to &NumValuesOfX.;
%let XVal=%scan(&ValuesOfX., &i.);
when ("&XVal.") rc= sum(of &XVal._:);
%end;
end;
run;
proc print data= test_data noobs;
run;
%mend;
%test;
Here is the demo, if you can play with this approach, you can bid good bye to the macro processor 🙂
data in;
input x $ a_1 a_2 a_3 b_1 b_2 b_3 c_1 c_2 c_3;
datalines;
a 1 1 1 2 2 2 3 3 3
b 1 1 1 2 2 2 3 3 3
c 1 1 1 2 2 2 3 3 3
;
run;
proc transpose data=in out=w;
id x;
run;
data want;
set w;
k=char(_name_,1);
run;
data final_want;
if _n_=1 then do;
if 0 then set want(keep=a k);
dcl hash h(dataset:'want(keep=a k)', multidata: 'y', ordered: 'y');
h.definekey('k');
h.definedata(all:'yes');
h.definedone();
end;
set in(keep=x);
sum=0;
do while(h.do_over(key:x) eq 0);
sum+a;
end;
output;
run;
Hi there,
This is a fascinating bit of code, and involves hash programming that I've never used. Looks like I have work to do!
Thanks again!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.