Parul
Fluorite | Level 6

I need to create a matrix with 734 rows and 17 columns. The row and column totals are fixed. How do I create such a matrix with random values that obey this restriction? I am using PROC IML with the code below, but it is not working properly.

proc iml;
use outd;
read all var {outd};
use ctrstat1;
read all var {count_sum};

a= j(734,17);
do i=1 to 12478;
call randseed(123);
call randgen(a, "Uniform");
end;

a[+, ] = count_sum ;
a[ ,+] = outd ;
print a;
quit;

 

outd is the vector that holds the total of each row, and count_sum is the vector that holds the sum of each column. Please help.

AnnaBrown
Community Manager

Hi Parul,

 

Thanks for your question. I've moved your question to the SAS/IML forum as more experts will be able to help here.

 

Best,

Anna



Rick_SAS
SAS Super FREQ

Are you trying to simulate a frequency table?  That is, are the entries of the matrix integers with specified row and column sums?

 

For example, if you tell me you want a 2x3 table with row sums {6, 7} and column sums {4 5 4}, a valid table would be

m = {1 2 3,
     3 3 1};

Is that what you want? Otherwise, please provide a small example that shows what you want.

Parul
Fluorite | Level 6

Thanks, Rick and Anna, for replying.

Rick, I want to create a matrix with 734 rows and 17 columns. I have the total of each row and each column in a data set, as in- and out-degrees. I want to fill this 734 x 17 matrix with random numbers (not integers) such that the row and column totals match my in- and out-degrees. I hope that clarifies my question.

Parul
Fluorite | Level 6

Rick, yes, you understood correctly. That is exactly what I want. Could you please help me code this?

Rick_SAS
SAS Super FREQ

I've never simulated a contingency table with two fixed marginals, but I don't think it is easy.  Looking at the statistical literature reveals a handful of papers with complicated algorithms.

 

Do you need only one of these matrices, or do you need thousands of them?  The second case is more difficult, since it requires creating a sampling algorithm that draws uniformly from the set of all contingency tables with fixed marginals, which is a hard problem.

The first case is simpler because it allows you to construct an approximation to the matrix, and then "fix it up" by examining places where the row/col sums are not correct and applying an iterative refinement.

 

Since this is a complicated problem, I have to ask what you are trying to accomplish? What is your ultimate goal?
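The "fix it up" idea mentioned above can be sketched as an alternating rescaling of rows and columns, which is the basic step behind iterative proportional fitting. This is only a minimal sketch under stated assumptions, not the method from the literature and not a uniform sampler: it starts from positive uniform random values and uses the small 7x3 marginals that appear elsewhere in this thread (not the 734x17 data). It assumes that the row sums and column sums add up to the same grand total; the names a, iter, and maxErr are illustrative.

```sas
proc iml;
call randseed(123);                      /* set the seed once, outside any loop */
side   = {34, 23, 21, 16, 16, 21, 12};   /* target row sums (column vector)     */
bottom = {60 52 31};                     /* target column sums (row vector)     */

a = j(nrow(side), ncol(bottom));         /* allocate a 7 x 3 matrix             */
call randgen(a, "Uniform");              /* positive random starting values     */

do iter = 1 to 200 until(maxErr < 1e-8);
   a = a # (side / a[,+]);               /* rescale each row to its target sum    */
   a = a # (bottom / a[+,]);             /* rescale each column to its target sum */
   maxErr = max(abs(a[,+] - side));      /* columns are exact now; check the rows */
end;

print iter maxErr;
print a[format=8.3], (a[,+])[label="Row sums"], (a[+,])[label="Col sums"];
quit;
```

The result is a non-integer matrix whose marginals match the targets to within the tolerance. It depends on the random starting values, so different seeds give different matrices, but nothing here guarantees any particular distribution over the set of valid tables.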

Parul
Fluorite | Level 6

I just want one such matrix, which will be my null model. I know there are many different solutions for the same set of in- and out-degrees. What I want is to take the average of all those simulated matrices as my null model. If this is not possible in SAS, can you suggest where I can do it?

Rick_SAS
SAS Super FREQ

Ahh! Now I see what you want. I apologize for not understanding earlier. I thought you were trying to do something more complicated.

 

The null model for a table is usually assumed to be the model of independence. You can easily generate an independent table by forming the (outer) product of the side marginal (sum across columns) with the bottom marginal (sum down the rows) and dividing by the grand total. For example, here is a program that produces a 7x3 table from the 7-element and 3-element marginal sums:

proc iml;
side = {34, 23, 21, 16, 16, 21, 12};
bottom = {60 52 31};
total = sum(side);  /* = sum(bottom); */

table = side*bottom/side[+];   /* null model of independence */
print table;

You can learn about other null models by looking at the documentation for the IPF function.
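As a short check of why the outer product reproduces both sets of marginals, write r_i for the side (row) sums, c_j for the bottom (column) sums, and N for the grand total. These symbols are introduced here only for illustration.

```latex
\begin{align*}
E_{ij} &= \frac{r_i\, c_j}{N}, \qquad N = \sum_i r_i = \sum_j c_j \\
\sum_j E_{ij} &= \frac{r_i}{N} \sum_j c_j = r_i \\
\sum_i E_{ij} &= \frac{c_j}{N} \sum_i r_i = c_j
\end{align*}
```

In the example above, N = 143, so the first cell is 34*60/143, which is approximately 14.27. The entries are generally not integers.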

 

Parul
Fluorite | Level 6

Thank you so much, Rick; the solution is what I want. Just one more thing, if you could help: I actually want random integers. I tried the CEIL and FLOOR functions with your code, but the result does not give the exact sum for some of the rows.

IanWakeling
Barite | Level 11

Maybe you could try selective use of either ceil() or floor() on each element of the matrix.  You could force either the row sums or the column sums to be correct, and the other is likely to be very close.   For example:

 

proc iml;

start rowround( x );
  s = x;
  do i = 1 to nrow(x);
    d = x[i, ] - floor(x[i, ]);
    s[i, ] = rank(d) > round(ncol(x) - sum(d));
  end;
  do i = 1 to nrow(x); do j = 1 to ncol(x);
    if s[i,j] then s[i,j] = ceil(x[i,j]);
              else s[i,j] = floor(x[i,j]);
  end; end;
  return(s);
finish;

side = {34, 23, 21, 16, 16, 21, 12};
bottom = {60 52 31};
total = sum(side);  /* = sum(bottom); */

table = side*bottom/side[+];   /* null model of independence */
print table;

table = rowround(table);

print table ,, bottom, (table[+,]), side (table[,+]);

quit;

 

 

Rick_SAS
SAS Super FREQ

@Parul There is no randomness to the solution that I posted, which is why I was confused earlier. This thread's title is "Random matrix..." so I initially assumed you wanted to draw a table "uniformly at random" from the set of all tables that have the given marginals. It is possible to sample uniformly at random, but it requires more effort, so I didn't want to pursue it until I understood why you need it and what you intend to do with it.

 

It sounds like you don't need a random matrix, but you only require ANY matrix that satisfies the marginal constraints (There's a difference!). If true, then use the IPF function.  The doc for the IPF function also has an example of a "greedy algorithm" that will satisfy your requirement.

 

Rick_SAS
SAS Super FREQ

After giving this issue A LOT of thought, I wrote a series of blog posts that culminated in the article "Simulate contingency tables with fixed row and column sums in SAS."  This turned out to be a very interesting topic, so thanks to @Parul for asking the question.
