trekvana
Calcite | Level 5
hello all -

I would like to create a matrix that looks like this, given a number n. Suppose n=4 for now:

1 1 1 1 1 1
1 2 1 1 1 1
1 3 1 1 1 1
1 4 1 1 1 1
2 2 0 1 1 1
2 3 0 1 1 1
2 4 0 1 1 1
3 3 0 0 1 1
3 4 0 0 1 1
4 4 0 0 0 1

So pretty much the big matrix is always going to contain n+2 columns: the first 2 columns correspond to ordered combinations of 1 to n, and the first number repeats n times, then n-1 times, then n-2, and so on. I know this is very confusing, and I might be explaining it wrong too. Here is a better way to say it with the example.

The first two columns consist of all the different pairs: (1 1) (1 2) (1 3) (1 4) (2 2) (2 3) (2 4) (3 3), etc.

There are 4 pairs that have 1 first, 3 pairs that have 2 first, 2 pairs with 3 first, and 1 pair with 4 first. So the number of pairs determines how many rows to add.

The first number of the pair determines where the 1s start. For example, (2 2) says start the 1s in the 2nd position: 0 1 1 1. (2 3) also says start the 1s in the 2nd position: 0 1 1 1.

And so on.

I'm sorry if this is confusing, but I can't figure out how to explain it. There is a clear structure to the matrix.

I just want to have a macro that I can give the number n, and it will spit out the n(n+1)/2 by n+2 matrix.

cheers
7 REPLIES
Rick_SAS
SAS Super FREQ
Yes, this is doable. Basically, the algorithm is
1) Figure out the dimensions of the matrix
2) Allocate a matrix of ones
3) Generate the first two columns
4) Fill in the zeros

I was going to post some code, but I have a question about how the first two columns are formed. You say "the first 2 columns correspond to ordered combinations," so I was going to suggest using the LEXCOMB function in Base SAS. However, there are only 6 distinct combinations (4 choose 2) of four elements taken two at a time. Combinations do not include the terms (1,1), (2,2), (3,3), or (4,4).

Then I thought, maybe you mean "permutations taken two at a time," but those don't include the (1,1), (2,2), (3,3), or (4,4) terms either, plus they DO include terms like (2,1), (3,1), (3,2), (4,1), etc.

So my questions:
1) What are the first two columns? Are they all combinations plus the constant terms?

2) What would the first few columns look like for 4 choose 3? Would you include (1,1,1), (1,1,2), ..., (1,2,1),...? Or would you exclude (1,2,1) because you already have (1,1,2)?

3) Maybe context would help. Are you trying to do something related to all two-way interactions in a regression model? If so, are the variables continuous or nominal?
Rick_SAS
SAS Super FREQ
I think it will be easier for you to generate the first two columns in the DATA step, and then use IML for the rest of the matrix. Maybe this is close to what you want?

%let n = 4;
%let k = 2;

/** first two columns: combinations of 1..n taken k at a time **/
data c&n._&k (keep=x1-x&k);
   array x{&n};
   ncomb = comb(&n, &k);
   do i = 1 to &n; x[i] = i; end;
   do i = 1 to ncomb;
      rc = lexcomb(i, &k, of x[*]);
      output;
   end;
   /** add in the repeated terms **/
   do i = 1 to &n;
      do j = 1 to &k; x[j] = i; end;
      output;
   end;
run;
proc sort; by x1-x&k; run;

proc iml;
varNames = "x1":"x&k";
use c&n._&k;
read all var varNames into c;
close c&n._&k;

n = &n; k = &k;
nrows = nrow(c);
x = j(nrows, n+2, 1);       /** start with a matrix of ones **/
x[, 1:2] = c;               /** index pairs go in columns 1-2 **/
/** the first n rows have first index 1 and stay all ones **/
do i = n+1 to nrows;
   idx = 3:(c[i,1]+1);      /** columns before the first index **/
   x[i, idx] = 0;
end;

print x;
trekvana
Calcite | Level 5
Rick -

There will always be only 2 first columns, corresponding to the covariance matrix parameter indices. Imagine a 4x4 covariance matrix (which is symmetric) with the indices (1,1) (1,2) (1,3) (1,4) (2,2) (2,3), etc. The fact that it is symmetric allows us not to worry about (2,1) (3,1) (3,2) and so on. Now that I think about it, ordered permutations was not the correct way to describe that.

Once we have the indices, we add 4 more columns (or in general, if the covariance matrix is nxn, we add n more columns) containing ones starting at the position given by the first number of the index.

So the row starting with (2,3) will be 2 3 0 1 1 1

Thus we will have zeros up to position (first index - 1), which in this example is 2-1=1, and then starting at position 2 we have all ones.
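
The rule described in this post can be written as a single comparison in SAS/IML. This is only a sketch for one row: the names i, jcol, n, and row are made up for illustration, and it assumes the n indicator columns are numbered 1 to n.

```sas
proc iml;
n = 4;                             /* dimension of the covariance matrix */
i = 2;  jcol = 3;                  /* the index pair (2,3) from the example */
/* (1:n) >= i is 0 for positions before i and 1 from position i onward */
row = i || jcol || ((1:n) >= i);
print row;
quit;
```

For the pair (2,3) with n=4 this should give 2 3 0 1 1 1, matching the row shown above.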
trekvana
Calcite | Level 5
Rick -

Thanks for the code. It works great. This is exactly what I wanted: just plug in n and get the matrix.

cheers
trekvana
Calcite | Level 5
One more question. How can I output the x matrix into a SAS data set, say A, with the columns labeled Parm Row Col1 Col2 ... Coln?

I know I can create the data set like this, but I'm not sure how to rename the columns:

create A from x;
append from x;
close A;
Rick_SAS
SAS Super FREQ
varNames = {"Row" "Col"} || ("Col1":"Col&n");
create A from x[colname=varNames];
append from x;
close A;
Rick_SAS
SAS Super FREQ
Great! I mistakenly thought that "k" could also vary, but I now understand that k=2 for your problem. In that case, we don't really need the DATA step code; we could do it entirely in IML.
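
An IML-only version might look something like the sketch below. It is not the code from this thread: the module name CovIndexMatrix and the variables rowIdx, colIdx, and r are hypothetical, and it follows the four steps listed earlier (dimensions, matrix of ones, first two columns, zeros) under the assumption that k=2.

```sas
proc iml;
/* Hypothetical module: returns the n(n+1)/2 by n+2 matrix for a given n */
start CovIndexMatrix(n);
   nrows = n*(n+1)/2;              /* one row per pair with rowIdx <= colIdx */
   x = j(nrows, n+2, 1);           /* allocate a matrix of ones */
   r = 0;                          /* current row of x */
   do rowIdx = 1 to n;
      do colIdx = rowIdx to n;
         r = r + 1;
         x[r, 1] = rowIdx;         /* first two columns hold the index pair */
         x[r, 2] = colIdx;
         /* zeros in the (rowIdx - 1) indicator columns before position rowIdx */
         if rowIdx > 1 then x[r, 3:(rowIdx+1)] = 0;
      end;
   end;
   return( x );
finish;

x = CovIndexMatrix(4);
print x;
```

With n=4 this should reproduce the 10 by 6 matrix from the original question. The guard on rowIdx > 1 is there because rows whose first index is 1 contain no zeros.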
