Automatic generation of all possible interaction terms (no duplicates)

Hi all, I have been trying to find a way to create all possible interaction terms for a procedure. The problem is that, for example, the a*b interaction is the same as b*a, and I couldn't find a reasonable way to exclude these duplicates. Here is how I started:

data data;
	input var $ @@;
	cards; 
		a b c d e f g h i j k 
	;
run;
data temp;
	set data;
run;
proc sql noprint;
	select strip(d.var)||"*"||strip(t.var) into :interactions separated by " "
		from data as d, temp as t;
quit;
%put &interactions;

And here is the result: 

%put &interactions;
a*a a*b a*c a*d a*e a*f a*g a*h a*i a*j a*k b*a b*b b*c b*d b*e b*f b*g b*h b*i b*j b*k c*a c*b
c*c c*d c*e c*f c*g c*h c*i c*j c*k d*a d*b d*c d*d d*e d*f d*g d*h d*i d*j d*k e*a e*b e*c e*d
e*e e*f e*g e*h e*i e*j e*k f*a f*b f*c f*d f*e f*f f*g f*h f*i f*j f*k g*a g*b g*c g*d g*e g*f
g*g g*h g*i g*j g*k h*a h*b h*c h*d h*e h*f h*g h*h h*i h*j h*k i*a i*b i*c i*d i*e i*f i*g i*h
i*i i*j i*k j*a j*b j*c j*d j*e j*f j*g j*h j*i j*j j*k k*a k*b k*c k*d k*e k*f k*g k*h k*i k*j
k*k

Please help!


All Replies
Super User
Posts: 19,815

Posted in reply to MetinBulus

What proc are you using? Have you tried using the | (bar) operator in the MODEL statement?

 

proc glm data=sashelp.cars;
    model mpg_city = horsepower | weight | length | cylinders @2;
run;quit;
Contributor
Posts: 43

I am using PROC LOGISTIC. The code is embedded within a macro, and the number of variables is not known in advance. This creates a pool of higher-order terms to be tried.
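For a list whose length is only known at run time, the pairs can also be built with plain macro loops. The following is only a sketch and not something posted in this thread: the macro name pairwise, its squares= option, and the sample names are made up for illustration, and it assumes the variable names arrive as a blank-separated list.

```sas
/* Hypothetical helper: returns every unordered pair from a blank-separated list. */
/* squares=Y also returns a*a, b*b, ...; squares=N leaves them out.               */
%macro pairwise(vars, squares=Y);
    %local i j n out start;
    %let n = %sysfunc(countw(&vars, %str( )));
    %do i = 1 %to &n;
        /* start the inner loop at i (with squares) or i+1 (without) */
        %if &squares = Y %then %let start = &i;
        %else %let start = %eval(&i + 1);
        %do j = &start %to &n;
            %let out = &out %scan(&vars, &i, %str( ))*%scan(&vars, &j, %str( ));
        %end;
    %end;
    &out
%mend pairwise;

%put %pairwise(a b c);
%put %pairwise(a b c, squares=N);
```

Each unordered pair appears once because the inner loop never goes back to a position before the outer variable, so a*b is generated and b*a never is.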

Super User
Posts: 19,815

Posted in reply to MetinBulus

MetinBulus wrote:

I am using PROC LOGISTIC. The code is embedded within a macro, and the number of variables is not known in advance. This creates a pool of higher-order terms to be tried.


So? The only thing that changes is the parameter list, with bars between the variable names. Nothing else is required.
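As a rough illustration of that point, a blank-separated list held in a macro variable can be turned into a bar-separated one and dropped straight into the MODEL statement. The macro variable names vars and barlist, the choice of SASHELP.HEART, and the predictors are assumptions made for this sketch, not part of the original macro.

```sas
/* hypothetical predictor list; in practice this would come from the calling macro */
%let vars = Weight Height Cholesterol Systolic;

/* collapse repeated blanks, then replace each blank with a bar */
%let barlist = %sysfunc(tranwrd(%sysfunc(compbl(&vars)),%str( ),%str(|)));
%put &barlist;

proc logistic data=sashelp.heart;
    /* @2 limits the expansion to main effects and two-way interactions */
    model status(event='Dead') = &barlist @2;
run;
```

Note that the bar expansion produces crossed terms only; squared terms such as Weight*Weight would still have to be added to the list separately.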


Contributor
Posts: 43

My goal is not to estimate a model but to create a list from which I can extract (or delete) terms. The solutions provided are sufficient to create such a list once the quadratic terms are added. I haven't tried it, but a dry run in PROC LOGISTIC or PROC GLMMOD may help too. Thank you for the great suggestions; I am learning a lot along the way.

Super User
Posts: 5,509

Posted in reply to MetinBulus

It looks like you should be able to add one of these two WHERE clauses:

 

where d.var < t.var

 

where d.var <= t.var

 

Which one to use depends on whether you want a*a, b*b, c*c, etc. as part of the list: the first leaves them out, the second keeps them.
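Applied to the original query, the second clause could look like the sketch below. It joins the data set to itself instead of to a copy, uses a shortened four-variable list in a data set named vars (both made up for the example), and adds an ORDER BY, which is not in the original query, so that the terms come out as a*a a*b ... rather than in whatever order the join happens to return. With n variables, <= leaves n(n+1)/2 terms and < leaves n(n-1)/2.

```sas
data vars;                      /* hypothetical short list of names */
    input var $ @@;
    cards;
a b c d
;
run;

proc sql noprint;
    /* <= keeps the squares (a*a, b*b, ...); change it to < to drop them */
    select catx('*', d.var, t.var)
        into :interactions separated by ' '
        from vars as d, vars as t
        where d.var <= t.var
        order by d.var, t.var;
quit;

%put &interactions;
/* number of terms in the list, to compare with n(n+1)/2 */
%put %sysfunc(countw(&interactions, %str( )));
```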

Contributor
Posts: 43

Posted in reply to Astounding

Yes, they should be included. Alphabetic names are arbitrary. 

Solution
‎03-16-2017 11:26 PM
Super User
Posts: 19,815

Posted in reply to MetinBulus

And if you still decide to go down that route, you don't need the temp data set: you can join the data set to itself.

 

data data;
	input var $ @@;
	cards; 
		a b c d e f g h i j k 
	;
run;

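proc sql noprint;
	/* self-join: d1 and d2 are two aliases of the same data set */
	select catx('*', d1.var, d2.var) into :interactions separated by " "
		from data as d1, data as d2
		/* keeps each unordered pair once (as b*a, c*a, ...); use >= to keep a*a, b*b, ... as well */
		where d1.var > d2.var;
quit;
%put &interactions;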
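When the names come from a real table rather than a typed list, the same self-join can be run against DICTIONARY.COLUMNS. This is a sketch under assumptions that are not in the thread: SASHELP.CLASS stands in for the modelling data set, only numeric columns are wanted, and VARNUM (the column position) replaces the name comparison so that upper and lower case in the names does not matter. It uses <= to keep the squares; change it to < to drop them.

```sas
proc sql noprint;
    select catx('*', c1.name, c2.name)
        into :interactions separated by ' '
        from dictionary.columns as c1, dictionary.columns as c2
        where c1.libname = 'SASHELP' and c1.memname = 'CLASS'   /* assumed input table */
          and c2.libname = 'SASHELP' and c2.memname = 'CLASS'
          and c1.type = 'num' and c2.type = 'num'               /* numeric columns only */
          and c1.varnum <= c2.varnum                            /* one term per unordered pair */
        order by c1.varnum, c2.varnum;
quit;
%put &interactions;
```

LIBNAME and MEMNAME values are stored in upper case in the dictionary tables, which is why the literals are written that way.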
Contributor
Posts: 43

Thanks, it is much better this way.

Contributor
Posts: 43

I was not aware that this solution would work every time, after the other approaches I tried had failed. Thanks a lot! Amazing!

Respected Advisor
Posts: 3,799

Posted in reply to MetinBulus

This example is taken right out of the documentation for CALL LEXCOMB.

 

data _null_;
   array x[11] $1 ('a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k');
   n=dim(x);
   k=2;                            /* size of each combination */
   ncomb=comb(n, k);               /* number of distinct pairs */
   do j=1 to ncomb;
      call lexcomb(j, k, of x[*]); /* j-th combination ends up in x1-x2 */
      put j 5. +3 x1 +(-1) '*' x2;
      end;
   run;

    1   a*b
    2   a*c
    3   a*d
    4   a*e
    5   a*f
    6   a*g
    7   a*h
    8   a*i
    9   a*j
   10   a*k
   11   b*c
   12   b*d
   13   b*e
   14   b*f
   15   b*g
   16   b*h
   17   b*i
   18   b*j
   19   b*k
   20   c*d
   21   c*e
   22   c*f
   23   c*g
   24   c*h
   25   c*i
   26   c*j
   27   c*k
   28   d*e
   29   d*f
   30   d*g
   31   d*h
   32   d*i
   33   d*j
   34   d*k
   35   e*f
   36   e*g
   37   e*h
   38   e*i
   39   e*j
   40   e*k
   41   f*g
   42   f*h
   43   f*i
   44   f*j
   45   f*k
   46   g*h
   47   g*i
   48   g*j
   49   g*k
   50   h*i
   51   h*j
   52   h*k
   53   i*j
   54   i*k
   55   j*k
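Since the goal in this thread is a macro variable rather than log output, the same loop can collect the pairs into a string and hand it to CALL SYMPUTX. This is a sketch only: the four variable names and the macro variable name interactions are placeholders, and the names are deliberately listed in ascending order, as in the documentation example, because CALL LEXCOMB works through the combinations in lexicographic order.

```sas
data _null_;
    /* hypothetical names, listed in ascending order */
    array x[4] $32 ('age' 'bmi' 'dose' 'sex');
    length list $32767;
    n = dim(x);
    k = 2;
    do j = 1 to comb(n, k);
        call lexcomb(j, k, of x[*]);               /* j-th pair is now in x1 and x2 */
        list = catx(' ', list, catx('*', x1, x2)); /* append "x1*x2" to the list */
    end;
    call symputx('interactions', list);
run;
%put &interactions;
```

As with the PUT version, this produces the crossed pairs only; squared terms would have to be appended separately.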
Contributor
Posts: 43

Posted in reply to data_null__

This is great, thanks! 

Respected Advisor
Posts: 3,799

Posted in reply to MetinBulus

Another way, which @Reeza alluded to, is to use GLMMOD to create an OUTPARM= data set. This also includes the one-way effects, which you may or may not want and can remove easily enough. Depending on what you're doing, this output may be more useful.

 

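data glmmod;
   array x[11] a b c d e f g h i j k (11*1);   /* one dummy row, every variable = 1 */
   y = 1;
   output;
   run;
proc glmmod data=glmmod outparm=parm noprint;
   class a--k;
   model y = a|b|c|d|e|f|g|h|i|j|k @2;         /* main effects plus all two-way terms */
   run;
proc print data=parm;
   run;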

Contributor
Posts: 43

Posted in reply to data_null__
This might work as well; it makes it easier to generate the quadratic terms, by subsetting the main effects and multiplying each by itself, to complete the list.
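The subsetting described here could look roughly like the following. It is a sketch rather than tested production code: it uses a shortened three-variable version of the GLMMOD step, assumes the OUTPARM= data set stores the effect label in a column named EFFNAME (as the GLMMOD documentation describes), and the data set name one and the macro variable names twoway and squares are made up for the example.

```sas
data one;
    array x[3] a b c (3*1);    /* one dummy row, every variable = 1 */
    y = 1;
    output;
run;

proc glmmod data=one outparm=parm noprint;
    class a b c;
    model y = a|b|c @2;
run;

proc sql noprint;
    /* crossed terms: effect names that contain an asterisk */
    select distinct effname
        into :twoway separated by ' '
        from parm
        where index(effname, '*') > 0;

    /* quadratic terms: each main effect multiplied by itself */
    select distinct catx('*', effname, effname)
        into :squares separated by ' '
        from parm
        where index(effname, '*') = 0
          and upcase(effname) ne 'INTERCEPT';
quit;

%put &twoway &squares;
```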