Piers
Obsidian | Level 7

Hi.

 

I would really appreciate some help with the SAS code I attach. What I want to do is to create a dataset consistent with the TYPE=COV structure, to be read into PROC SIMNORM. I attach the code used to create the dataset. However, I get the following error message when I ask PROC SIMNORM to read the dataset scov:

 

ERROR: Invalid covariance or conditional covariance matrix; matrix is not positive definite.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.SSIM may be incomplete. When this step was stopped there were 1900
observations and 5 variables.
WARNING: Data set WORK.SSIM was not replaced because this step was stopped.

 

And I do not understand what I have done wrong.

 

Any help would be deeply appreciated

 

Best

 

Piers C

 

 

 

FreelanceReinh
Jade | Level 19

Hi @Piers,

 

Covariance matrices are always positive semidefinite, but 21 of your 125 matrices are not. The first example is mtx=20: The determinant of this matrix is −0.184 < 0, hence it's not a covariance matrix. Therefore SAS errors out after the first 19 matrices (i.e., after creating 19*100=1900 observations in dataset WORK.SSIM).
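
In case the step from "negative determinant" to "not a covariance matrix" is not obvious, here is the standard linear-algebra argument, sketched for a symmetric 3x3 matrix. The symbols \Sigma and \lambda_i are just notation chosen for this sketch, not names from your data:

```latex
\det(\Sigma) = \lambda_1 \lambda_2 \lambda_3,
\qquad
\Sigma \text{ positive semidefinite}
\;\Longleftrightarrow\; \lambda_1, \lambda_2, \lambda_3 \ge 0
\;\Longrightarrow\; \det(\Sigma) \ge 0 .
```

So a determinant of −0.184 means that at least one eigenvalue is negative, which is impossible for a covariance matrix.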

Piers
Obsidian | Level 7

Many thanks for this. It's enormously helpful.

 

I am not a mathematician, and therefore need to understand more about matrix determinants. So, if you have time:

 

1) Could you point me in the direction of a sensible reference which lays out what the important issues are?

2) Do you have SAS code which allowed you to calculate the determinant from the data structure I have, please?

 

Again, many thanks

 

Piers

FreelanceReinh
Jade | Level 19

You're welcome.

 

I think the Wikipedia articles Covariance matrix, Definite symmetric matrix and Determinant contain more than enough information for starters.

 

Actually I computed that one determinant by hand. The SAS tool for (vector and) matrix operations is SAS/IML, but I don't have a license for it. As part of Base SAS there are some CALL routines for matrix operations available in PROC FCMP, in particular CALL DET to compute determinants, but I haven't really started using those special functions and CALL routines (and using them is less convenient than using ordinary SAS functions and CALL routines). Moreover, there are some relevant methods available in the DS2 procedure (as I've just discovered!), in particular the DET method, which also computes determinants.
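
For reference, a minimal sketch of the CALL DET route in PROC FCMP might look like the following. The matrix values are made up for illustration (they are not taken from scov), and the names m and d3 are arbitrary:

```sas
proc fcmp;
   /* Hypothetical symmetric 3x3 matrix, initial values listed row by row */
   array m[3,3] (1.0,  0.9,  0.9,
                 0.9,  1.0, -0.9,
                 0.9, -0.9,  1.0);
   call det(m, d3);   /* d3 receives the determinant of m */
   put d3=;           /* written to the output/listing destination */
quit;
```

Worked by hand with the Rule of Sarrus, this determinant is 1 − 1.458 − 2.43 = −2.888, so the example matrix is not a covariance matrix even though its first two leading principal minors (1 and 0.19) are positive.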

 

For your 3x3 matrices a manual implementation of the Rule of Sarrus is sufficient:

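data det(keep=mtx d:);
array m[3,3];
/* read the three matrix rows of one matrix; rows with a blank _name_ are skipped */
do i=1 to dim1(m);
  set scov(where=(_name_ ne ' '));
  m[i,1]=a;
  m[i,2]=b;
  m[i,3]=c;
end;
/* leading principal minors of order 1 and 2 */
d1=m[1,1];
d2=m[1,1]*m[2,2]-m[1,2]*m[2,1];
/* determinant (order 3) by the Rule of Sarrus */
d3= m[1,1]*m[2,2]*m[3,3]+m[1,2]*m[2,3]*m[3,1]+m[1,3]*m[3,2]*m[2,1]
   -m[1,3]*m[2,2]*m[3,1]-m[2,3]*m[3,2]*m[1,1]-m[3,3]*m[2,1]*m[1,2];
run;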

The DATA step above computes not only the determinants of your 125 matrices (variable d3), but also the first and second leading principal minors (variables d1 and d2). A symmetric matrix is positive definite if and only if all of its leading principal minors (here d1, d2 and d3) are positive (Sylvester's criterion). The DATA step below selects the matrices with at least one negative leading principal minor, which therefore cannot be covariance matrices:

data nocovmat;
set det;
if .<min(of d:)<0;
run;

(You constructed the matrices to be symmetric. Of course, this could also be checked in the first DATA step:

sym=(m[1,2]=m[2,1] & m[1,3]=m[3,1] & m[2,3]=m[3,2]);

For the flag to appear in dataset det, sym would have to be added to the KEEP= list.)

 

There's no exactly analogous criterion for positive semidefinite matrices, though: All leading principal minors being non-negative does not imply that a symmetric matrix is positive semidefinite. But for non-degenerate multivariate normal distributions you need positive definite covariance matrices anyway.
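
A small counterexample (an illustration chosen for this sketch, not one of the 125 matrices) shows why the semidefinite case needs more care. All three leading principal minors are zero, yet the quadratic form takes a negative value:

```latex
M = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix},
\qquad
d_1 = 0,\quad d_2 = 0,\quad d_3 = \det(M) = 0,
\qquad
x = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
\;\Rightarrow\;
x^{\mathsf T} M x = -1 < 0 .
```

To establish positive semidefiniteness via minors, one would have to check all principal minors, not only the leading ones. Here the 1x1 principal minor m[3,3] = −1 already fails.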

Piers
Obsidian | Level 7
Wow - very many thanks. The reason for doing this is to try to discover under what covariance conditions the regression model A = B + C + B*C gives a potentially significant interaction term even though A, B, and C are derived from multivariate normal distributions. You have really been astonishingly helpful.

Best

Piers
