Calcite | Level 5

I have a data set, and I need to know how to produce a matrix from it with the following formula:

1/2 + the sum of the running products of row(i) through row(n). That is to say, I have a matrix with, let's say, 2 columns and 3 rows, and I need to produce a new matrix by calculating a value for each element, where the value in a cell is v(row,column). In the real matrix the numbers of rows and columns are 106 and 133.

First the original matrix

          column1     column2

row1     v(1,1)          v(1,2)

row2     v(2,1)          v(2,2)

row3     v(3,1)          v(3,2)

And then the desired one. Read carefully: the v(row,column) values come from the matrix above.

          column1                                                                         column2

row1    1/2+v(1,1)+v(1,1)*v(2,1)+v(1,1)*v(2,1)*v(3,1)              1/2+v(1,2)+v(1,2)*v(2,2)+v(1,2)*v(2,2)*v(3,2)

row2    1/2+v(2,1)+v(2,1)*v(3,1)                                              1/2+v(2,2)+v(2,2)*v(3,2)

row3    1/2+v(3,1)                                                                    1/2+v(3,2)


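Let A be the desired matrix without the 1/2 term. Notice that the formula for the first column of A can be written in nested form:

A(3,1) = v(3,1)
A(2,1) = v(2,1)*(1 + v(3,1)) = v(2,1)*(1 + A(3,1))
A(1,1) = v(1,1)*(1 + v(2,1)*(1 + v(3,1))) = v(1,1)*(1 + A(2,1))

and similarly for the other columns. Therefore you can compute A by working from the bottom row to the top. The i-th row of A is the i-th row of V times (1 + the (i+1)-th row of A), elementwise, as follows. (Then add the 1/2 at the end.)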

proc iml;
V = {1 2,
     3 4,
     5 6};
N = nrow(V);
A = j(N, ncol(V), .); /* allocate */
A[N, ] = V[N, ];      /* assign last row */
do i = N-1 to 1 by -1;
   A[i, ] = V[i, ]#(1 + A[i+1, ]);  /* build row i from row i+1 */
end;
A = 1/2 + A;          /* add the constant term */
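One way to convince yourself that the bottom-up recursion matches the formula in the question is to evaluate the sum of running products directly and compare. The following is only a sketch for checking purposes, using the same small example matrix; the names B and p are made up for this illustration.

```sas
proc iml;
V = {1 2,
     3 4,
     5 6};
N = nrow(V);

/* Direct evaluation: for row i, add v(i), v(i)*v(i+1), ... down to row N */
B = j(N, ncol(V), 0);
do i = 1 to N;
   p = j(1, ncol(V), 1);        /* running product, one entry per column */
   do k = i to N;
      p = p # V[k, ];           /* extend the product by row k */
      B[i, ] = B[i, ] + p;      /* accumulate the sum of products */
   end;
end;
B = 1/2 + B;
print B;
quit;
```

For this example the first column works out by hand to 1/2 + 1 + 1*3 + 1*3*5 = 19.5, 1/2 + 3 + 3*5 = 18.5, and 1/2 + 5 = 5.5, which is what the recursion gives as well. The direct version uses a nested loop, so the recursion is the better choice for the real 106 x 133 matrix.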

Calcite | Level 5

Thank you so much Rick! Now the problem seems to be that I have some missing values, and those should be skipped. So basically, if there is a missing value, the terms should be multiplied just like before, but with the missing factors left out. I was thinking about simply replacing the missing values with 1 or 0 and then running the proc iml code, but then the formula is also carried out for the terms that used to be missing.


Dealing with missing values can be complicated. Mathematically, they can/should propagate, but you intend to replace missing values with nonmissing ones, which can be dangerous because the missing values are often there for a reason. You are biasing your results if you arbitrarily replace missing values by 0 or 1. Statisticians tend to use some imputation technique instead of simple replacement.

If you insist on replacing the missing values, try some variation of this approach. Inside the loop, use these formulas:

   Q = choose(V[i, ]=., 0, V[i, ]);

   R = choose(V[i, ]=., 1, V[i, ]);

   A[i, ] = Q + R#A[i+1, ];

I don't know what you want to do if there is a missing value in the last row. Presumably use the 'Q' formula.
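Putting the pieces together, a complete program might look like the sketch below. The example matrix is made up for illustration, and treating a missing value in the last row as 0 (the 'Q' formula) is an assumption, not something the question settles.

```sas
proc iml;
/* Hypothetical example data; "." marks a missing value */
V = {1 2,
     . 4,
     5 .};
N = nrow(V);
A = j(N, ncol(V), .);

/* Assumption: a missing value in the last row contributes 0 ('Q' formula) */
A[N, ] = choose(V[N, ]=., 0, V[N, ]);

do i = N-1 to 1 by -1;
   Q = choose(V[i, ]=., 0, V[i, ]);   /* additive part: missing adds nothing */
   R = choose(V[i, ]=., 1, V[i, ]);   /* multiplicative part: missing acts as 1 */
   A[i, ] = Q + R#A[i+1, ];
end;
A = 1/2 + A;
print A;
quit;
```

With these formulas a row with a missing value simply passes the row below it through: in column 1 the missing v(2,1) is skipped, so row 1 becomes 1/2 + 1 + 1*5 = 6.5 and row 2 becomes 1/2 + 5 = 5.5. Note that the 1/2 is still added in every cell, including cells that were originally missing.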

Calcite | Level 5

Hello again. I'm still working on my code, so there's a small change. Can you tell me what to change in the code to leave the LAST row as it is?


Why not save the last row in a vector, apply the transformation, and then restore the original values for the last row?
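A minimal sketch of that idea, assuming "as it is" means the last row of the result should equal the last row of V, without the 1/2 added. The name lastRow is made up for this illustration.

```sas
proc iml;
V = {1 2,
     3 4,
     5 6};
N = nrow(V);
lastRow = V[N, ];                 /* save the original last row */

A = j(N, ncol(V), .);
A[N, ] = V[N, ];
do i = N-1 to 1 by -1;
   A[i, ] = V[i, ]#(1 + A[i+1, ]);
end;
A = 1/2 + A;

A[N, ] = lastRow;                 /* restore the original values */
print A;
quit;
```

The rows above the last one are still computed from the last row's values; only the final stored values of row N are put back. For this example the last row of A is 5 and 6 rather than 5.5 and 6.5.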


