BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bernhard
Calcite | Level 5

I am trying to vectorize the code below.

Can somebody help me to avoid the two do loops?

You can ignore the code before the do loops, that is simply to generate the data.

proc iml;
/*Generate Data*/
data=J(5,9,0);
call randseed(1);
call randgen(data, "Normal");

dim=nrow(data);
obs=ncol(data);
weights=J(obs,obs,0);
call randgen(weights, "Normal");

mat=j(dim,dim,0);
/*I would like to vectorize this loop*/
do j=1 to obs;
  do i=1 to obs;
  x=data[,i]-data[,j];
  mat=mat+weights[i,j]*x*t(x);
  end; 
end;
quit;

Message was edited by: bernhard  I (rather Rick Wicklin actually) noticed an error in the code. In fact I need to go through the full inner loop up to obs.

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

Sorry that I didn't see that you revised your original code.

Yes, Ian's method will work, although it is slightly more efficient to use the # operator to avoid forming the diagonal matrix:

do j=1 to p;

   x=data-data[,j];

   mat=mat+x*(weights[,j]#t(x));

end;

(For an explanation, see Converting between correlation and covariance matrices - The DO Loop )

If you determine that the weights are positive, the you can use

do j=1 to p;

   x=(data-data[,j])#sqrt(t(weights[,j]));

   mat=mat + x*x`;

end;

View solution in original post

8 REPLIES 8
Rick_SAS
SAS Super FREQ

1) What are you trying to accomplish statistically? For example, is this a weighted OLS regression or some other standard statistical analysis?

2) You have obs=ncol(data) and dim=nrow(data).  Usually the rows of the data matrix contain the observations and the columns contain the variables, but you seem to have reversed that convention.

3) The program contains the comment "of course the loop could go to obs." I think vectorizing will be easier if the inner loop is complete, but when I do that (and remove the line mat=mat+t(mat)), I don't get the same answer.  Can you write the complete inner loop?

4) For vectorization, there are basic principles that often help. See "How to vectorize computations in a matrix language"  If you answer 1-3, I think it will be easier to apply these prinicples.

bernhard
Calcite | Level 5

Hi Rick

Thank you. I have read several very interesting posts on your Blog.

1. I am trying to calculate the gradient of a more complex function. I do not want to post it here though. Maybe I simply need to do some more thinking to see the simplification.

2. You are right about that. This has been so form the beginning in my code. I unfortunately started with a transposed matrix. Although it is only a notation. I do not want to change all my code.

3. In fact I had an error there. I need to do the full loop.

4. I have seen that post, thank you.

I realize that the above answers will probably not help you to answer my original question. I need to do some more thinking or maybe post the full function which I want to differentiate. Although I believe it is possible that my program cannot be simplified.

Rick_SAS
SAS Super FREQ

You can get rid of the inner loop. I haven't figured out whether the outer loop can be simplified. The WEIGHTS matrix and the data[,i]-data[,j] computation make each computation dependent on a loop variable.

So is this is some sort of finite difference computation, and that's why we have data[,i]-data[,j]?

Can we assume that the WEIGHTS matrix is positive? Your example fills it with random normal values, but that isn't usually the case for a matrix of that name.

bernhard
Calcite | Level 5

Can you please tell me how to avoid the inner loop?

I am investigating a Mahalanobis distance and try to minimize an estimation error depending on the input matrix of the Mahalanobis matrix.

The weights matrix is not positive.

Rick_SAS
SAS Super FREQ

Sure, I was waiting for you to post the correct code with the full inner loop (#3 above).

If you are computing a Mahalanibos-type matrix, you might be interested in this article: How to compute Mahalanobis distance in SAS - The DO Loop It presents several tricks for efficiently computing weighted distance matrices.

bernhard
Calcite | Level 5

I have edited the original post (i.e. the code section) a few days ago. Here is the loop


do j=1 to obs;

  do i=1 to obs;

  x=data[,i]-data[,j];

  mat=mat+weights[i,j]*x*t(x);

  end; 

end;


Thanks for the link I will have a look at that post.

IanWakeling
Barite | Level 11

I think you can get rid of the inner loop by doing this:

  do j = 1 to obs;

    x = data - data[, j];

    mat = mat + x * diag(weights[ ,j]) * t(x);

  end;

But I can't see any way to avoid the outer loop.

Rick_SAS
SAS Super FREQ

Sorry that I didn't see that you revised your original code.

Yes, Ian's method will work, although it is slightly more efficient to use the # operator to avoid forming the diagonal matrix:

do j=1 to p;

   x=data-data[,j];

   mat=mat+x*(weights[,j]#t(x));

end;

(For an explanation, see Converting between correlation and covariance matrices - The DO Loop )

If you determine that the weights are positive, the you can use

do j=1 to p;

   x=(data-data[,j])#sqrt(t(weights[,j]));

   mat=mat + x*x`;

end;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 8 replies
  • 2206 views
  • 3 likes
  • 3 in conversation