Solved: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE? - Page 2

Ksharp · Posted 10-26-2014 10:54 AM

Ou, Sorry. I just review the link you post. It appeared that is totally different with my thought. I would think about it when I am free.Can't you use the code you find to get what you want ?

/*** -*-mode:simple-sas-*- **************************************************************  DESCRIPTION: IPF2WAY rakes the data in dataset &DSIN. to the row and column controls contained in datasets &ROWCTRLDS. and &COLCTRLDS., respectively.  It does so using the two-way rake also known as Iterative Proportional Fitting and the RAS algorithm.  PROGRAMMERS: webb.sprague@ofm.wa.gov, inspired by Chuck Taylor at the US Census Bureau  DATE STARTED: 2011-08-30  INPUT (DATASETS, NAMES, ETC): DSIN:        Input dataset VAR:         Variables to be raked ROWCTRLDS:   Row control dataset ROWCTRLVAR:  Row control variable (single variable: all controls are in one column) COLCTRLDS:   Column control dataset COLCTRLVAR:  Column control variables  OUTPUT  (DATASETS, NAMES, ETC): DSOUT:       Output dataset  NOTES:  2011-08-31 WS (WARNING): This code has only been tested on a single set of dummy data (below).  Be careful when applying to real data!  2011-08-30 WS (TODO): This macro doesn't have the ID keys functionality of the original one.  Have to merge horizontally, should add in the future...  2011-08-30 WS (TODO): Needs functionality to set zeros to really small numbers (internally defined, I think).  2011-08-30 WS: The algorithm here was coded using Eddie H's presentation on IPF. The original inspiration was from the Census Bureau (2004-ish), but it didn't work except for the stuff to interact with outside datasets.  **********************************************************************************/ %macro IPF2WAY(DSIN=,DSOUT=,ROWCTRLDS=,VAR=,ROWCTRLVAR=,COLCTRLDS=,COLCTRLVAR=); proc iml;  /* tolerance at which to stop iterating -- half a person ... */ TOL=0.5;  /* Read in data, row controls, and column controls.  xxxVAR parameters control which columns get fed into the various matrices */  use &DSIN.; read ALL var {&VAR.} into MATRIX;  use &ROWCTRLDS.; read ALL var {&ROWCTRLVAR.} into ROWCTRL;  use &COLCTRLDS.; read current var {&COLCTRLVAR.} into COLCTRL;  /* use _MATRIX for intermediate and final results, keep original in MATRIX */ _MATRIX = MATRIX;  /* Main loop ... */ /* ...initialize DIFF to large value to force execution of do loop */ DIFF = 1000; /* ... loop until finished. */ do until (DIFF < TOL); _OLDMAT  =  _MATRIX; ROWPR    =  _MATRIX / _MATRIX[,+]; /* divide matrix row-wise by rowsum to get controlled proportions*/ _MATRIX  =  ROWCTRL # ROWPR;        /* multiply controls by proportion to get numbers*/ COLPR    =  _MATRIX / _MATRIX[+,]; /* ditto with columns */ _MATRIX  =  COLCTRL # COLPR; DIFF     =  max(abs(_MATRIX - _OLDMAT)); /* check if want to end loop */ end;  /* output to a new dataset (XXX need to retain key vars for merging but dont - see note.) */ create &DSOUT. from _MATRIX [colname={&VAR}]; append from _MATRIX; quit;  %mend;   /********************************************************************************* Test code. **********************************************************************************/ %MACRO _TEST;  proc sql; create table test_dsin (idstr varchar, x1 numeric, x2 numeric, x3 numeric); insert into test_dsin values ('foo1', 1, 2, 1); insert into test_dsin values ('foo2', 3, 5, 5); insert into test_dsin values ('foo3',  6, 2, 2);  create table test_rowctrlds (rt numeric); insert into test_rowctrlds values (5); insert into test_rowctrlds values (15); insert into test_rowctrlds values (8);  create table test_colctrlds (ct1 numeric, ct2 numeric, ct3 numeric); insert into test_colctrlds values (11, 9, 8); quit;  %_RAKE2WAYS(DSIN=test_dsin, DSOUT=test_dsout, VAR=x1 x2 x3, ROWCTRLDS=test_rowctrlds, ROWCTRLVAR=rt, COLCTRLDS=test_colctrlds, COLCTRLVAR=ct1 ct2 ct3);  %MEND;

pap · Posted 10-26-2014 10:59 AM

Hi Xia - I am new to IML and need some help with the coding. I am trying. Thanks. -Pappe

Ksharp · Posted 10-26-2014 11:22 AM

Hi. pap

Actually I am also a rookie of IML . If I have some time, I will review it .But now Rick could help you , once he saw the code you posted .

pap · Posted 10-26-2014 11:24 AM

ok. Thanks Xia.

pap · Posted 10-26-2014 11:55 AM

Attached is an Excel solution for the 3D problem (based on an example). I would use Excel if I did not have to solve this problem for 50 three-way tables (50 US states) and the levels of the variables being large and different across states. I think a SAS macro using IML is the best tool for this.

Thanks. -Pappe

Rick_SAS · Posted 10-27-2014 11:21 AM

Maybe I am confused, but I thought your problem was that you want to construct a 3D table that has given 2D marginals? It looks like the Excel spreadsheet starts with a 3D table (the red numbers) and then applies IPF in order to adjust the frequencies to fit the 2D marginal distributions. Thus the spreadsheet merely demonstrates the IPF algorithm. The spreadsheet doesn't demonstrate how to oobtain the red numbers in the first place.

Similarly, the Hunsinger PPT and macro that you link to has the statement "you will need a three-dimensional (e.g. Age by Sex by

Race) seed." The "seed" means that you have to START with a 3D matrix. The IPF algoithm then adjusts the matrix.

In short, I still don't know how to create a 3D frequency table that has given 2D marginals.

pap · Posted 10-27-2014 11:44 AM

Hi Rick - In the Excel spreadsheet, the starting 3D table is merely a seed. It can be populated with 1s and 0s (where marginal sum to 0). I actually built the solution in Excel. A bit tedious, but I had to. Thanks again to you and Xia for spending your valuable time to help me. I really appreciate it.

Regards,

Pappe

Rick_SAS · Posted 10-27-2014 03:28 PM

Ah! Thanks for pointing that out. I did not notice that before.

I still don't understand how it all works, but I translated your Excel formulas into SAS/IML and reproduced your computations. The program is below. Maybe it will be useful for you.

Notice that I transposed the definition of the column marginal (blue) and I interlaced the even and odd columns of your 3D matrices because I think that is what the IPF and MARG calls expect. However, I am not an expert on using those calls, so I might have made some mistakes. I am also not sure how to properly update the 3D matrix. I used the REPEAT and SHAPE functions to get the dimensions to work out, but the code might break if you change from this 3x2x4 example.

Anyway, HTH.

proc iml;
/* 3 x 2 x 4 example */
dim={3 2 4 };
A = {1 2 1,
     5 4 2,
     3 5 5,
     5 5 5,
     6 2 2,
     3 8 7,
     1 7 2,
     2 7 6};

start RowMarginal(dim, A);
   call marg(locmar, m, dim, A, {2,3});
   return( shape(m,dim[3],dim[2]) );
finish;
start ColMarginal(dim, A);
   call marg(locmar, m, dim, A, {1,2});
   return( shape(m,dim[2],dim[1]) );
finish;
start SliceMarginal(dim, A);
   call marg(locmar, m, dim, A, {1,3});
   return( shape(m,dim[3],dim[1]) );
finish;

/* define target marginals */
RowTarg = {9 11,
           17 13,
           19 16,
           7 8};
ColTarg = {22 20 10,
           13 20 15};
SliceTarg = {7 9 4,
             8 12 10,
            15 12 8,
             5 7 3};

B = A;
do iter = 1 to 4;
   print iter;
   /* row adjustment */
   RowFactor = RowTarg / RowMarginal(dim, B);
   B = B # shape(RowFactor, dim[2]*dim[3]);
   print B[F=5.2];
   /* col adjustment */
   ColFactor = ColTarg / ColMarginal(dim, B);
   B = B # shape(repeat(ColFactor, dim[3]), 0, dim[1]);
   print B[F=5.2];
   /* slice adjustment */
   SliceFactor = SliceTarg / SliceMarginal(dim, B);
   B = B # shape(repeat(SliceFactor, 1, dim[2]), 0, dim[1]);
   print B[F=5.2];
end;

pap · Posted 10-27-2014 03:54 PM

Hi Rick - This is great! I will adapt your code to my problem. Thank you so much for your help and for teaching us to program in IML elegantly.

Regards,

-Pappe

Rick_SAS · Posted 10-28-2014 06:24 AM

You are welcome! Sorry this took so long. I had wrongly assumed that you wanted a 3D frequency matrix that consisted of integers. I did not think to ask whether a decimal matrix would be sufficient. That greatly simplifies the problem.

pap · Posted 10-28-2014 09:05 AM

Thanks Rick - I am doing the rounding afterwards. Not optimal, but I took too much of time already. Thanks again. -Pappe

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Re: SAS IML: HOW TO COMBINE 2-WAY MARGINAL TABLES INTO A 3-WAY TABLE?

Ready to join fellow brilliant minds for the SAS Hackathon?