Calcite | Level 5
Hello all,

I would like to create a matrix that looks like this, given the number of columns n. Suppose n=4 for now:

1 1 1 1 1 1
1 2 1 1 1 1
1 3 1 1 1 1
1 4 1 1 1 1
2 2 0 1 1 1
2 3 0 1 1 1
2 4 0 1 1 1
3 3 0 0 1 1
3 4 0 0 1 1
4 4 0 0 0 1

So basically the big matrix is always going to contain n+2 columns. The first 2 columns correspond to ordered combinations of 1 to n, and the rows come in blocks of n rows, then n-1 rows, then n-2, and so on. I know this is very confusing, and I might be explaining it wrong too. Here is a better way to say it with the example.

The first two columns consist of all the different pairs: (1 1) (1 2) (1 3) (1 4) (2 2) (2 3) (2 4) (3 3) etc.

There are 4 pairs that have 1 first, 3 pairs that have 2 first, 2 pairs with 3 first, and 1 pair with 4 first. So the number of pairs determines how many rows to add.

The first number of the pair determines where the ones start. For example, (2 2) says start the ones in the 2nd position: 0 1 1 1. (2 3) also says start the ones in the 2nd position: 0 1 1 1

and so on.

I'm sorry if this is confusing, but I can't figure out how to explain it. There is a clear structure to the matrix.

I just want a macro that I can give the number of observations n, and it will spit out the n(n+1)/2 by n+2 matrix.

Yes, this is doable. Basically, the algorithm is
1) Figure out the dimensions of the matrix
2) Allocate a matrix of ones
3) Generate the first two columns
4) Fill in the zeros

I was going to post some code, but I have a question about how the first two columns are formed. You say "the first 2 columns correspond to ordered combinations," so I was going to suggest using the LEXCOMB function in Base SAS. However, there are only 6 distinct combinations (4 choose 2) of four elements taken two at a time. Combinations do not include the terms (1,1), (2,2), (3,3), or (4,4).

Then I thought, maybe you mean "permutations taken two at a time," but those don't include the (1,1), (2,2), (3,3), or (4,4) terms either, plus they DO include terms like (2,1), (3,1), (3,2), (4,1), etc.

So my questions:
1) What are the first two columns? Are they all combinations plus the constant terms?

2) What would the first few columns look like for 4 choose 3? Would you include (1,1,1), (1,1,2), ..., (1,2,1),...? Or would you exclude (1,2,1) because you already have (1,1,2)?

3) Maybe context would help. Are you trying to do something related to all two-way interactions in a regression model? If so, are the variables continuous or nominal?
I think it will be easier for you to generate the first two columns in the DATA step, and then use IML for the rest of the matrix. Maybe this is close to what you want?

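%let n = 4;
%let k = 2;

/* generate the index columns in the DATA step */
data c&n._&k (keep=x1-x&k);
array x{&n};
ncomb = comb(&n, &k);
do i = 1 to &n; x[i]=i; end;
/* all combinations of n items taken k at a time */
do i=1 to ncomb;
   rc=lexcomb(i, &k, of x[*]);
   output;
end;
/** add in the repeated terms **/
do i = 1 to &n;
   do j = 1 to &k; x[j]=i; end;
   output;
end;
run;
proc sort; by x1-x&k; run;

proc iml;
varNames = "x1":"x&k";
use c&n._&k;
read all var varNames into c;
close c&n._&k;

n = &n; k = &k;
nrows = nrow(c);
x = j(nrows, n+2, 1);      /* allocate a matrix of ones */
x[, 1:2] = c;              /* first two columns are the index pairs */
/* the first n rows have 1 as the first index and need no zeros */
do i = n+1 to nrows;
   idx = 3:(c[i,1]+1);
   x[i, idx] = 0;
end;

print x;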
trekvana
Calcite | Level 5
Rick -

So there will always be exactly 2 leading columns, corresponding to the indices of the covariance matrix parameters. Imagine a 4x4 covariance matrix (which is symmetric) with the indices (1,1) (1,2) (1,3) (1,4) (2,2) (2,3), etc. Because it is symmetric, we do not need to worry about (2,1) (3,1) (3,2) and so on. Now that I think about it, "ordered permutations" was not the correct way to describe that.

Once we have the indices, we add 4 more columns (or in general, if the covariance matrix is nxn, we add n more columns) containing ones that start at the position given by the first number of the index.

So the row starting with (2,3) will be 2 3 0 1 1 1

Thus we will have zeros up to position (first index - 1) (in this example 2-1=1), and then starting at position 2 we have all ones.
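
For reference, the rule described above can be written compactly. This is only a restatement of the description in this post; the symbols $X$, $i$, $j$, and $m$ are notation introduced here for illustration and do not appear in the thread.

```latex
% Row indexed by the pair (i,j) with 1 <= i <= j <= n; m runs over the n pattern columns.
\[
X_{(i,j),\,1} = i, \qquad
X_{(i,j),\,2} = j, \qquad
X_{(i,j),\,2+m} =
\begin{cases}
0, & m < i,\\
1, & m \ge i,
\end{cases}
\qquad m = 1,\dots,n .
\]
% Row count: n pairs start with 1, n-1 start with 2, ..., 1 starts with n.
\[
\sum_{i=1}^{n} (n - i + 1) = \frac{n(n+1)}{2}.
\]
```

For n=4 and the pair (2,3), this gives the row 2 3 0 1 1 1, matching the example.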
Calcite | Level 5

Thanks for the code. It works great. This is exactly what I wanted: just plug in n and get the matrix.

Cheers
Calcite | Level 5
One more question. How can I output the x matrix into a SAS data set, say A, with the columns labeled Parm Row Col1 Col2 ... Coln?

I know I can create the data set like this, but I am not sure how to rename the columns:

create A from x;
append from x;
close A;
SAS Super FREQ
varNames = {"Row" "Col"} || ("Col1":"Col&n");
create A from x[colname=varNames];
append from x;
close A;
SAS Super FREQ
Great! I mistakenly thought that "k" could also vary, but I now understand that k=2 for your problem. In that case, we don't really need the DATA step code; we could do it entirely in IML.
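
As a rough sketch of what an IML-only version might look like when k=2, the index pairs can be generated with two nested loops instead of the DATA step. The module name UpperTriPattern and the variable names p, q, and r are made up for this illustration; they are not from the code posted earlier in the thread.

```sas
%let n = 4;

proc iml;
/* Hypothetical helper: returns the n(n+1)/2 x (n+2) matrix.
   Columns 1-2 hold the pairs (p,q) with p <= q;
   columns 3 to n+2 hold zeros followed by ones starting at position p. */
start UpperTriPattern(n);
   nrows = n*(n+1)/2;
   x = j(nrows, n+2, 1);            /* allocate a matrix of ones */
   r = 0;                           /* current row */
   do p = 1 to n;
      do q = p to n;
         r = r + 1;
         x[r, 1] = p;
         x[r, 2] = q;
         /* zeros in pattern positions 1..p-1, i.e. columns 3..p+1 */
         if p > 1 then x[r, 3:(p+1)] = 0;
      end;
   end;
   return( x );
finish;

x = UpperTriPattern(&n);
print x;
quit;
```

With n=4 this should reproduce the 10 x 6 matrix from the original question. The p > 1 check is there because 3:2 is a decreasing index in IML, not an empty one, so the rows whose first index is 1 have to be skipped explicitly.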


