07-05-2017 09:42 AM

Hello All,

I am studying the two-way interaction effects of multiple variables. If there are 17 variables of interest (say x1, x2, …, x17), there will be 17*16/2 = 136 two-way interaction terms. How can I create all 136 two-way interaction terms and store them in a data table?

Thanks! Any ideas will be much appreciated!

All Replies

07-05-2017 10:03 AM

```
%mktex(2 ** 17, n=32)   /* make a design */

* code it;
proc transreg data=design design;
   model class(%macro int; x1 %do i = 2 %to 17; | x&i %end; @2 %mend; %int);
   output out=coded;
run;

proc print; run;
```

07-07-2017 10:29 AM

Thanks very much, WarrenKuhfeld, but I couldn't read your code clearly. Could you please make up a matrix (say a={1 2 3 5}) and retype the code?

07-05-2017 10:11 AM - edited 07-06-2017 11:39 AM

You can use any of several SAS procedures to write a design matrix. For a summary and example, see "Four ways to create a design matrix in SAS."

Since you posted this in the SAS/IML forum, you can also use the HDIR function to create interaction effects between columns of two design matrices. See "Dummy variables in SAS/IML." But you know this already since you previously asked about the HDIR function.
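
As a small illustration of what HDIR computes (a sketch with made-up 2 x 2 matrices, not taken from the linked article): each row of the result contains every product of an element in that row of the first matrix with an element in the same row of the second.

```sas
proc iml;
a = {1 2,
     3 4};          /* made-up example values */
b = {5 6,
     7 8};
c = hdir(a, b);     /* row i holds a[i,1]*b[i,] followed by a[i,2]*b[i,] */
print c;            /* rows: {5 6 10 12} and {21 24 28 32} */
quit;
```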

07-05-2017 10:18 AM

Hi Rick, thanks for your help. I figured out how to use HDIR when the matrices have missing values.

However, a new issue has come up. Since we have 17 variables of interest, HDIR gives us 17*17 = 289 terms, but what we want is just the 17*16/2 = 136 two-way interaction terms. Besides the quadratic terms, the HDIR result contains each interaction twice. For example, with 3 variables a, b, and c, HDIR gives us aa, ab, ac, ba, bb, bc, ca, cb, and cc.

Do you have any suggestions on this?
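
For reference, the counts above fit together as follows (a worked check, writing p for the number of variables): HDIR(a,a) yields p² columns, of which p are squares and the rest are the distinct pairs, each appearing twice.

$$
p^2 = \underbrace{p}_{\text{squares}} + 2\cdot\underbrace{\frac{p(p-1)}{2}}_{\text{distinct pairs}},
\qquad
17^2 = 289 = 17 + 2\cdot 136 .
$$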

Solution

07-05-2017 05:41 PM

07-05-2017 01:12 PM

I see. So you are building the interaction of a variable with itself. Again I emphasize that it will be easier to use one of the dedicated SAS procedures for this. You can use the SUBMIT statement to call a SAS procedure from within a SAS/IML program.

That said, if c=HDIR(a,a), the columns you want to keep are exactly the indices (in row-major order) of the strictly upper-triangular elements of a p x p matrix, where p=ncol(a). In the following statements, I use some SAS/IML functions that might be new to you:

- The EXPANDGRID function returns the Cartesian product of two vectors
- The ROW and COL functions can be used to find the indices of the upper-triangular elements

The program just figures out which columns correspond to the desired interactions. I've also generated example names, which is not necessary but seems useful.

```
proc iml;
a = {1 2 3 5};
c = hdir(a,a); /* all two-way interactions */
names = "a":"d";
p = ncol(a);
interactNames = expandgrid(names, names); /* names of all two-way interactions */
fullNames = interactNames[,1] + "*" + interactNames[,2];
print c[colname=fullNames];
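M = shape(1:p##2, p);               /* p x p matrix holding the column indices of c in row-major order */
keepIdx = loc( row(M) < col(M) );   /* indices of the strictly upper-triangular elements */
partInteract = c[ ,keepIdx];        /* keep only the products of two distinct variables */
partNames = interactNames[keepIdx,1] + "*" + interactNames[keepIdx,2];
print partInteract[colname=partNames];
quit;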
```
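
To get from this small example to the original question (17 variables, results stored in a data set), the same indexing can be applied to a data matrix. The following is only a sketch: the random data, the data set name Interactions, and the x1_x2 naming scheme are assumptions for illustration, not part of the original problem.

```sas
proc iml;
/* Hypothetical data: 5 observations of 17 variables (assumption) */
call randseed(123);
p = 17;
x = j(5, p);
call randgen(x, "Normal");

c = hdir(x, x);                       /* all p##2 = 289 products, row by row */
M = shape(1:p##2, p);                 /* p x p matrix of row-major indices */
keepIdx = loc( row(M) < col(M) );     /* the 136 strictly upper-triangular positions */
inter = c[ , keepIdx];                /* 5 x 136 matrix of two-way interactions */

/* Build valid variable names such as x1_x2 from the (i,j) index pairs.
   ROWCATC moves the blanks produced by CHAR to the end of each string. */
g  = expandgrid(1:p, 1:p);            /* same ordering as the columns of c */
ij = g[keepIdx, ];
varNames = rowcatc("x" + char(ij[,1]) + "_x" + char(ij[,2]));

/* Write the interactions to a SAS data set (name is hypothetical) */
create Interactions from inter[colname=varNames];
append from inter;
close Interactions;
quit;
```

With real data, x would instead be read from a data set (for example with USE and READ), and the names could be built from the actual variable names.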

07-06-2017 09:05 AM

Why not pick out the right columns?

```
proc iml;
a = {1 2 3 5};
c = hdir(a,a);
n=ncol(a);
bad_idx=do(1,n#n,n+1);
good_idx=setdif(1:n#n,bad_idx);
want=c[,good_idx];
print c,want;
quit;
```

07-06-2017 09:41 AM

Lots of good solutions.

If it doesn't have to be an IML solution, PROC GLMMOD does the job as well.
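
A minimal sketch of the PROC GLMMOD route, assuming a hypothetical data set named have with numeric variables x1-x4 and a placeholder response y for the MODEL statement. The bar operator with @2 expands to the main effects plus all two-way interactions, so the intercept and main-effect columns would need to be dropped afterwards if only the interactions are wanted.

```sas
/* Hypothetical example data (values are made up) */
data have;
   input x1 x2 x3 x4;
   y = 0;                        /* placeholder response for the MODEL statement */
   datalines;
1 2 3 5
2 1 4 3
3 5 1 2
;
run;

proc glmmod data=have outdesign=design outparm=parm noprint;
   model y = x1|x2|x3|x4 @2;     /* main effects and all two-way interactions */
run;

proc print data=parm; run;       /* maps the design columns (Col1, Col2, ...) to effect names */
proc print data=design; run;     /* the design matrix itself */
```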

07-06-2017 11:02 AM

Sorry to disagree, but personally, I think GLMMOD does not provide a *good* solution, although it can certainly work. You will understand my biases as you read further. GLMMOD was simply designed to provide some mechanism to output to a data set the same coding that GLM uses. There is no built-in correspondence between the columns in the matrix and the rows when there are BY variables, and there is nothing descriptive in the column names. The test below illustrates.

Transreg was designed to enable various types of coding and provide sensible names and labels along with several options that provide user control over the names and labels. Unlike GLMMOD, a variable always means the same thing throughout the data set.

Much later, when various forms of coding were added to other modeling procedures, they followed the lead of transreg (although not the syntax of transreg), but went beyond what transreg did in many ways. Different variable naming and labeling "cults" were provided as options (although they were never referred to that way outside of SAS).

When I learned linear models in graduate school in the early 1980s at UNC from Ron Helms, there was no automatic way in SAS to get alternative codings (reference cell, LTFR, cell mean, effects, splines, separate slopes and intercepts, separate slopes same intercept, multiple parallel lines, etc.) into a data set. I made providing those codings a priority when I wrote transreg starting in the mid 1980s. I also made transreg handle some esoteric codings common in discrete choice modeling. As I said, now other procs go way beyond what I did in transreg, but GLMMOD is not one of them.

Sorry for the history lesson, but I figure someone out there might find this SAS modeling history interesting.

```
%mktex(3 ** 6, n=18, out=d1)
%mktex(2 ** 6, n=12, out=d2)

data x;
   set d1 d2(in=i);
   by = i;
   y = _n_;
run;

proc print; run;

proc glmmod data=x outdesign=g1(drop=y) noprint;
   class x1-x6;
   model y = x1-x6;
run;

proc glmmod data=x outdesign=g2(drop=by y) noprint;
   class x1-x6;
   model y = x1-x6;
   by by;
run;

proc compare error note briefsummary criterion=1e-10
             data=g1 compare=g2 method=relative(1);
run;

proc print data=g1; run;
proc print data=g2; run;
proc contents varnum; ods select position; run;

proc transreg data=x design replace;
   model class(x1-x6);
   output out=t1(drop=_:);
run;

proc transreg data=x design replace;
   model class(x1-x6);
   output out=t2(drop=_: by);
   by by;
run;

proc compare error note briefsummary criterion=1e-10
             data=t1 compare=t2 method=relative(1);
run;

proc print data=t1; run;
proc print data=t2; run;
proc contents varnum; ods select position; run;
```

07-10-2017 08:56 AM

WarrenKuhfeld wrote:

Sorry for the history lesson, but I figure someone out there might find this SAS modeling history interesting.

This is very useful information, and I have bookmarked this thread. Thanks!

07-06-2017 08:00 PM

Thanks for your reply. Any more specific suggestions?

07-06-2017 07:50 PM

Thanks very much for your reply, Ksharp. The result we want should have 6 elements {2,3,5,6,10,15}, but yours returns more than that. I think the problem is that bad_idx only includes the indexes of the diagonal elements.

Could you please revise your code?

07-07-2017 08:06 AM

OK. Same as Rick's code.

```
proc iml;
a = {1 2 3 5};
c = hdir(a,a);
n=ncol(a);
idx=shape(do(1,n#n,1),n);
want_idx=loc( row(idx) < col(idx) );
want = c[ ,want_idx];
print c,want;
quit;
```

07-07-2017 09:59 AM

Great! This works!

07-07-2017 12:11 PM

This is probably a little more efficient and gets to the index in one line!

```
n = ncol(a);
want_idx = cusum(vech( j(n-1,n-1) + diag(1:(n-1)) ));
```
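
A quick hand check of this one-liner for n = 4 (the size of the example matrix a used earlier in the thread), assuming VECH stacks the lower-triangular elements column by column:

$$
J_{3\times 3} + \operatorname{diag}(1,2,3) =
\begin{pmatrix} 2 & 1 & 1 \\ 1 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix}
\;\xrightarrow{\ \text{VECH}\ }\; (2,1,1,3,1,4)
\;\xrightarrow{\ \text{CUSUM}\ }\; (2,3,4,7,8,12)
$$

These are the row-major positions of the strictly upper-triangular elements of a 4 x 4 matrix, the same set that loc( row(idx) < col(idx) ) selects. The increments explain why it works: within a row of the triangle, consecutive indices differ by 1, and moving from the last element of row i to the first kept element of row i+1 skips ahead by i+2.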