DATA Step, Macro, Functions and more

RESOLVE macro function to sum variables in a dataset

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

RESOLVE macro function to sum variables in a dataset

Greetings!

I'd like to sum a set of variables, the set of which are identified by a variable in the dataset. I realize there are many ways to do this, but ultimately I'll be processing large output from PROC MCMC where array statements are unwieldy.

In the data set 'test' I identify the variable list to be summed by the first column labeled 'x', so that I will compute a_1+a_2+a_3 in the first row, then b_1+b_2+b_3 in the second row, etc.

 

data in;
	input x $ a_1 a_2 a_3 b_1 b_2 b_3 c_1 c_2 c_3;
datalines;
a	1	1	1	2	2	2	3	3	3
b	1	1	1	2	2	2	3	3	3
c	1	1	1	2	2	2	3	3	3
;
run;

 

 

The macro %vTest writes out the summation:

 

%macro vtest(xVal);
0 %do i = 1 %to 3; +&xVal._&i. %end;
%mend vtest;

 

data set 'test_data' reads dataset 'in' and processes using the RESOLVE function and %vtest. Results are then printed.

 

data test_data;
	set in;
   		rc = resolve(cats('%vtest(',x,')')); 
run;

options nocenter;
proc print data= test_data noobs;
run;

As you can see, the column rc shows 0+a_1+a_2+a_3. I'd like rc to show the actual sum, i.e. 3, 6, and 9.

 

Many thanks in advance!

Garnett

 

 

 

 

 


Accepted Solutions
Solution
‎12-19-2017 06:12 PM
Super User
Super User
Posts: 8,278

Re: RESOLVE macro function to sum variables in a dataset

[ Edited ]

Not sure what the macro has to do with the problem. But if you want to generate code then first generate the code to a file and then use %INCLUDE to make it part of your data step.

proc sort data=in(keep=x) out=xlist nodupkey;
  by x ;
run;

filename code temp;
data _null_;
  set xlist end=eof;
  file code ;
  if _n_=1 then put 'select (x);' ;
  put 'when (' x :$quote. ') rc=%vtest(' x ');' ;
  if eof then put 'end;' ;
run;

options mprint;
data want ;
  set in ;
%include code / source2 ;
run;

Result.

 

1271  options mprint;
1272  data want ;
1273    set in ;
1274  %include code / source2 ;
NOTE: %INCLUDE (level 1) file CODE is file C:\Users\...\#LN00069.
1275 +select (x);
1276 +when ("a" ) rc=%vtest(a );
MPRINT(VTEST):  0 +a_1 +a_2 +a_3
1277 +when ("b" ) rc=%vtest(b );
MPRINT(VTEST):  0 +b_1 +b_2 +b_3
1278 +when ("c" ) rc=%vtest(c );
MPRINT(VTEST):  0 +c_1 +c_2 +c_3
1279 +end;
NOTE: %INCLUDE (level 1) ending.
1280  run;

NOTE: There were 3 observations read from the data set WORK.IN.
NOTE: The data set WORK.WANT has 3 observations and 11 variables.

 

View solution in original post


All Replies
Respected Advisor
Posts: 3,271

Re: RESOLVE macro function to sum variables in a dataset

[ Edited ]

Well this certainly gets complicated, and I hope you are right that you can't use ARRAYS in your real-world situation, but I am skeptical.

 

However, you seem to have over-complicated things even in the macro case. I believe this is what you want:

 

data test_data;
	set in;
   		rc = %vtest(a); 
run;

,  

--
Paige Miller
Occasional Contributor
Posts: 6

Re: RESOLVE macro function to sum variables in a dataset

Posted in reply to PaigeMiller

Thanks very much for looking at this! Your solution just resolves to a_1+a_2+a_3 for each row of the dataset. I'd like the summation over variables that can be defined by column x of the input dataset.

Super User
Posts: 24,004

Re: RESOLVE macro function to sum variables in a dataset

It's not clear how you plan to define the rules and function - that's important.If you can provide some more details we can provide better options.

Super User
Posts: 24,004

Re: RESOLVE macro function to sum variables in a dataset

Wouldn't IML maybe be a better approach in this case?

Occasional Contributor
Posts: 6

Re: RESOLVE macro function to sum variables in a dataset

Thanks for looking at this, Reeza.

IML might be the approach I take. I'd like to find a datastep + macro solution if possible.

Super User
Posts: 13,941

Re: RESOLVE macro function to sum variables in a dataset


gnet wrote:

Thanks for looking at this, Reeza.

IML might be the approach I take. I'd like to find a datastep + macro solution if possible.


It might help to describe exactly what you are attempting. Summing within "columns", variables in SAS terminology, is usually done in either a summary procedure such as Proc Means or Summary. Summing multiple variables often done in data step or Proc SQL. Depending on needs Proc Report maybe able to do both.

 

I will say that your example data is poor as you could sums of 3,  either within observation ("row") by adding a_1+a_2+a_3 OR adding the three rows for any of a_1, a_2 or a_3.

 

Example:

proc means data=in;
   var _numeric_;
   output out=summary (drop=_:) sum=;
run;

Will sum each variable in the example data set.

 

If then need to sum the variables of a, b and c groups:

data want;
   set summary;
   suma= sum(of a:);
   sumb= sum(of b:);
   sumc= sum(of c:);
   keep sum: ;
run;

That proc means and data step could be turned into a macro if needed but seeing an actual use case with data that lets us know which way and specific sum should go might help.

 

Super User
Super User
Posts: 8,278

Re: RESOLVE macro function to sum variables in a dataset

If you want macro code to add numbers you need to pass the numbers to the macro environment.  It will know nothing of the data step variable A_1, let alone what value it has for this iteration of the data step.

 

How many possible values of X are there?

 

data want ;
  set in ;
  select (x);
    when ('a') result = sum(of a_:);
    when ('b') result = sum(of b_:);
    when ('c') result = sum(of c_:);
  end;
run;
Occasional Contributor
Posts: 6

Re: RESOLVE macro function to sum variables in a dataset

Thanks Tom!

There may be thousands of values of 'x', but more importantly the macro %vtest may be an arbitrary function and not just a sum. I used the sum in the example to make it easier to present.

Thanks again,

Garnett

Super User
Posts: 2,061

Re: RESOLVE macro function to sum variables in a dataset

[ Edited ]

You could;

1. Transpose

2. extract letter prefix from _name_ variable, this forms a group

3. look up with hash or any look up that you are comfortable with and execute sum

 

@gnet Try this, if you still can't find it, I can show a demo using your sample piece shortly after my class

Occasional Contributor
Posts: 10

Re: RESOLVE macro function to sum variables in a dataset

[ Edited ]

You could follow Tom's example but create a loop to go through all your values of x then write whichever generic function you use instead of sum.

**create array with distinct values of X;
proc sql noprint;
    select distinct(x) into :ValuesOfX separated by ' '
    from in;
%let NumValuesOfX = &SqlObs.;
    
**Generate test_data;
%macro test;

data test_data;
    set in;
    select(x);
        %do i=1 %to &NumValuesOfX.;
            %let XVal=%scan(&ValuesOfX., &i.);
            when ("&XVal.") rc= sum(of &XVal._:);    
        %end;
    end;
run;

proc print data= test_data noobs;
run;

%mend;    
%test;
Super User
Posts: 13,941

Re: RESOLVE macro function to sum variables in a dataset


gnet wrote:

Thanks Tom!

There may be thousands of values of 'x', but more importantly the macro %vtest may be an arbitrary function and not just a sum. I used the sum in the example to make it easier to present.

Thanks again,

Garnett


"Thousands of values of 'X' " has the potential to have issues related to "thousands of macro variables" and having to adjust macro memory settings and other issues.

 

Actually "sum" using  +&xVal._&I construct is a very poor approach because if any single variable is missing then the result is missing, unless that is your desired behavior.

Perhaps you really want to use Proc FCMP to make your function which does allow passing an array of values and returning a single value

Occasional Contributor
Posts: 10

Re: RESOLVE macro function to sum variables in a dataset

You could follow Tom's example and create a loop to go over all your X values and then substitute the sum with any generic fucntion you are using.

 

**create array with distinct values of X;
proc sql noprint;
	select distinct(x) into :ValuesOfX separated by ' '
	from in;
%let NumValuesOfX = &SqlObs.;
	
**Generate test_data;
%macro test;

data test_data;
	set in;
	select(x);
		%do i=1 %to &NumValuesOfX.;
			%let XVal=%scan(&ValuesOfX., &i.);
			when ("&XVal.") rc= sum(of &XVal._:);	
		%end;
	end;
run;

proc print data= test_data noobs;
run;

%mend;	
%test;

Super User
Posts: 2,061

Re: RESOLVE macro function to sum variables in a dataset

Here is the demo, if you can play with this approach, you can bid good bye to the macro processor Smiley Happy

 

 

 

data in;

     input x $ a_1 a_2 a_3 b_1 b_2 b_3 c_1 c_2 c_3;

datalines;

a    1    1    1    2    2    2    3    3    3

b    1    1    1    2    2    2    3    3    3

c    1    1    1    2    2    2    3    3    3

;

run;

 

proc transpose data=in out=w;

id x;

run;

 

data want;

set w;

k=char(_name_,1);

run;

 

data final_want;

if _n_=1 then do;

if 0 then set want(keep=a k);

dcl hash h(dataset:'want(keep=a k)', multidata: 'y', ordered: 'y');

 h.definekey('k');

 h.definedata(all:'yes');

 h.definedone();

 end;

 set in(keep=x);

 sum=0;

 do while(h.do_over(key:x) eq 0);

  sum+a;

end;

output;

run;

Occasional Contributor
Posts: 6

Re: RESOLVE macro function to sum variables in a dataset

Posted in reply to novinosrin

Hi there,

This is a fascinating bit of code, and involves hash programming that I've never used. Looks like I have work to do!

Thanks again!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 17 replies
  • 646 views
  • 2 likes
  • 7 in conversation