COV matrix incomplete

Occasional Contributor PBG
Occasional Contributor
Posts: 13

I'm brand new to SAS and am trying to use it to generate random values for four normally distributed, collinear variables. A colleague of mine prepared the following code, but it throws "ERROR: COV matrix is incomplete in data set."

 

Data RandomValueGeneratorStatistics (type=COV) ; 
input _TYPE_ $ _NAME_ $ variable1 variable2 variable3 variable4; 
datalines ; 
COV variable1  0.739191357 0.276109171 0.100056621 470.1092606
COV variable2  0.876109171 0.327432304 0.648272489 0.611948925
COV variable3  0.900056621 0.848272489 0.163812314 0.558222994
COV variable4  470.1092606 0.611948925 0.358222994 5474224.269
MEAN           4.217123172 41.69388108 4.893316488 8606.147733
;
run; 

Proc Simnorm data=RandomValueGeneratorStatistics outsim=ssim 
numreal = 1000 
seed = 54321 ; 
var variable1 variable2 variable3 variable4 ; 
run;

 

Using version 6.100.0.2870. Any ideas as to why? Please share. Thanks for your help!



All Replies
Occasional Contributor PBG
Occasional Contributor
Posts: 13

Re: COV matrix incomplete

I should say EG 6.

 

Is it possible that the variable names as specified are too long, causing the matrix to be read in incorrectly?

Contributor
Posts: 22

Re: COV matrix incomplete


Hello.

 

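There are a few things going on here.

The first thing I noticed after running a PROC PRINT is that your _NAME_ variable is being truncated, as you can see below. With "_NAME_ $" and no width in the INPUT statement, the variable gets the default length of 8 characters, so "variable1" is stored as "variable" and no longer matches the variable name.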

 

Obs    _TYPE_     _NAME_     variable1    variable2    variable3     variable4

1      COV      variable       0.739      0.27611      0.10006         470.11
2      COV      variable       0.876      0.32743      0.64827           0.61
3      COV      variable       0.900      0.84827      0.16381           0.56
4      COV      variable     470.109      0.61195      0.35822     5474224.27

 

To fix this issue, specify a width for the variable in the INPUT statement ($10., for example):

  

Data RandomValueGeneratorStatistics (type=COV) ; 
input _TYPE_ $ _NAME_ $10. variable1 variable2 variable3 variable4; 
datalines ; 
COV variable1  0.739191357 0.276109171 0.100056621 470.1092606
COV variable2  0.876109171 0.327432304 0.648272489 0.611948925
COV variable3  0.900056621 0.848272489 0.163812314 0.558222994
COV variable4  470.1092606 0.611948925 0.358222994 5474224.269
MEAN           4.217123172 41.69388108 4.893316488 8606.147733
;
run;

 

After making this change and running a Proc Print the output now looks like this:

 

Obs    _TYPE_     _NAME_      variable1    variable2    variable3     variable4

1      COV      variable1       0.739       0.2761      0.10006         470.11
2      COV      variable2       0.876       0.3274      0.64827           0.61
3      COV      variable3       0.900       0.8483      0.16381           0.56
4      COV      variable4     470.109       0.6119      0.35822     5474224.27
5      MEAN                     4.217      41.6939      4.89332        8606.15

 

Making this change will fix the error you are currently getting, but running the SIMNORM procedure will still produce other errors. I'm not that familiar with the SIMNORM procedure, but I believe your covariance matrix needs to be both symmetric and positive definite, and your current input matrix does not satisfy these conditions.
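
The symmetry part can be checked in Base SAS without SAS/IML. The following is only a rough sketch: it assumes the data set has exactly four COV rows stored in the same order as the variables, and the tolerance of 1e-12 is an arbitrary choice, not something SIMNORM documents.

```sas
/* Sketch: flag asymmetric entries in a TYPE=COV data set (Base SAS only). */
/* Assumes four COV rows, in the same order as variable1-variable4.        */
data _null_;
  set RandomValueGeneratorStatistics end=last;
  where _TYPE_ = 'COV';
  array v{4} variable1-variable4;
  array m{4,4} _temporary_;   /* keeps the whole matrix across rows */
  do j = 1 to 4;
    m{_n_, j} = v{j};         /* row index = position among the COV rows */
  end;
  if last then do i = 1 to 4;
    do j = i + 1 to 4;
      upper = m{i,j};
      lower = m{j,i};
      /* 1e-12 is an assumed tolerance for this sketch */
      if abs(upper - lower) > 1e-12 then
        put 'Asymmetric entry: row ' i 'column ' j upper= lower=;
    end;
  end;
run;
```

Reading the matrix posted above by eye, the entries at (1,2), (1,3), (2,3), and (3,4) differ from their mirror images, so those are the pairs to correct.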

 

I hope this at least helps you get going in the right direction.

Occasional Contributor PBG
Occasional Contributor
Posts: 13

Re: COV matrix incomplete

Thanks for your help @jdwaterman91.

 

I adjusted the variable length parameter and made sure the matrix I'm actually using is symmetric (I had mistyped some entries).

 

What is meant by 'positive-definite'? Could my matrix not being positive-definite be the reason I still get "ERROR: COV matrix is incomplete in data set"? Thanks for your help.

Contributor
Posts: 22

Re: COV matrix incomplete

A symmetric matrix is positive-definite if all of its eigenvalues are positive.

 

If your matrix were not positive-definite, the SAS log would show the following error message instead:

 

ERROR: Invalid covariance or conditional covariance matrix; matrix is not positive definite.
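
For anyone without SAS/IML, one possible way to look at the eigenvalues is PROC PRINCOMP, which can read a TYPE=COV data set directly. This is a hedged sketch rather than a documented SIMNORM pre-check: it assumes SAS/STAT is licensed (SIMNORM itself is part of SAS/STAT) and that the data set already reads in correctly with full _NAME_ values. I have not confirmed how PRINCOMP reports a matrix that is not positive definite.

```sas
/* Sketch: print the eigenvalues of the covariance matrix without SAS/IML. */
/* The COV option requests analysis of the covariance matrix itself        */
/* rather than the correlation matrix derived from it.                     */
proc princomp data=RandomValueGeneratorStatistics cov;
  var variable1 variable2 variable3 variable4;
run;
```

In the "Eigenvalues of the Covariance Matrix" table, every eigenvalue should be strictly greater than zero for the matrix to be positive definite.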

As for making sure that your matrix satisfies these conditions, there are experts in this community who are far more experienced than I am and who could do a much better job of explaining exactly what you need to do, or of suggesting alternative methods for achieving your result. I will defer to them.

 

Occasional Contributor PBG
Occasional Contributor
Posts: 13

Re: COV matrix incomplete


Appreciate your help very much, thanks @jdwaterman91.

 

I am now using this matrix:

COV variable1 0.999368288 0.009398213 0.075083647 0.453177098
COV variable2 0.009398213 0.000155024 0.000878636 0.006372190
COV variable3 0.075083647 0.000878636 0.008894056 0.043200581
COV variable4 0.453177098 0.006372190 0.043200581 0.999368288
MEAN          3.217123172 42.69388108 3.893316488 8605.147733

I still receive 'ERROR: COV matrix incomplete in data set'.

 

Is it possible that some of the matrix entries are so small (e.g. 0.000155024) that they are interpreted incorrectly?

Contributor
Posts: 22

Re: COV matrix incomplete

Because of the spacing in your data lines, try setting the informat for the _NAME_ variable to $9. instead of $10.
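
A fixed-width informat such as $9. or $10. depends on how many blanks sit between the fields. One way to avoid that dependency, sketched here under the assumption that plain list input is acceptable for this data, is to declare the length up front and put a period in the _NAME_ position of the MEAN row as a missing-value placeholder (the IML example elsewhere in this thread uses the same placeholder). The length of 32 is an assumption chosen to fit any SAS variable name.

```sas
/* Sketch: spacing-independent read of a TYPE=COV data set.          */
/* LENGTH sets _NAME_ wide enough that "variable1" is not truncated. */
data RandomValueGeneratorStatistics (type=COV);
  length _TYPE_ $ 8 _NAME_ $ 32;
  input _TYPE_ $ _NAME_ $ variable1 variable2 variable3 variable4;
  datalines;
COV variable1 0.999368288 0.009398213 0.075083647 0.453177098
COV variable2 0.009398213 0.000155024 0.000878636 0.006372190
COV variable3 0.075083647 0.000878636 0.008894056 0.043200581
COV variable4 0.453177098 0.006372190 0.043200581 0.999368288
MEAN . 3.217123172 42.69388108 3.893316488 8605.147733
;
run;

/* Confirm that _NAME_ holds the full variable names before running SIMNORM. */
proc print data=RandomValueGeneratorStatistics;
run;
```

With list input, any number of blanks between values is fine, so retyping or realigning the data lines should not change what is read.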

 

Occasional Contributor PBG
Occasional Contributor
Posts: 13

Re: COV matrix incomplete

I did so. No luck, unfortunately.

Solution
‎06-20-2017 03:19 PM
Contributor
Posts: 22

Re: COV matrix incomplete

This is the code I have based on what you've posted so far:

 

Data RandomValueGeneratorStatistics (type=COV) ; 
input _TYPE_ $ _NAME_ $9. variable1 variable2 variable3 variable4; 
datalines ; 
COV variable1 0.999368288 0.009398213 0.075083647 0.453177098
COV variable2 0.009398213 0.000155024 0.000878636 0.006372190
COV variable3 0.075083647 0.000878636 0.008894056 0.043200581
COV variable4 0.453177098 0.006372190 0.043200581 0.999368288
MEAN          3.217123172 42.69388108 3.893316488 8605.147733
;
run; 

Proc Simnorm data=RandomValueGeneratorStatistics outsim=ssim 
numreal = 1000 
seed = 54321 ; 
var variable1 variable2 variable3 variable4 ; 
run;

... And some of the results:

 

Obs    variable1    variable2    variable3    variable4    Rnum

  1      3.75650      42.7002      3.99283      8607.08       1
  2      3.74764      42.7058      3.86268      8606.20       2
  3      2.67129      42.6923      3.86041      8603.92       3
  4      1.81036      42.6811      3.78579      8604.83       4
  5      3.06134      42.6897      3.89000      8604.87       5
  6      3.08770      42.6886      3.92196      8604.73       6
  7      4.44447      42.6934      3.87695      8605.65       7
  8      2.72725      42.6790      3.90519      8604.83       8
  9      3.60691      42.6917      3.88077      8604.06       9
 10      4.04193      42.7055      3.95174      8604.69      10
 11      2.17955      42.6804      3.80562      8604.45      11
 12      3.15123      42.6909      3.94730      8605.94      12

Is this what you are looking for?
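
To sanity-check the simulated values, one option is to compare their sample means and covariances with the inputs. This is a minimal sketch that assumes the OUTSIM= data set is named ssim as in the code above; with 1000 realizations the sample statistics should be close to, but not exactly equal to, the MEAN and COV rows that were supplied.

```sas
/* Sketch: sample covariance matrix and means of the simulated data. */
/* The VAR statement leaves out the Rnum realization counter.        */
proc corr data=ssim cov;
  var variable1 variable2 variable3 variable4;
run;
```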

Occasional Contributor PBG
Occasional Contributor
Posts: 13

Re: COV matrix incomplete

Precisely the code I have! I'm not sure why it's not working the same way for me. That's exactly the result I'm looking for!

Contributor
Posts: 22

Re: COV matrix incomplete

What error message is the log giving you?

Occasional Contributor PBG
Occasional Contributor
Posts: 13

Re: COV matrix incomplete

Same as before: "ERROR: COV matrix incomplete in data set..."

 

My variable names are actually written with a capital 'V'. Could that be a problem? I'll just copy and paste the code you kindly provided and try it.

Occasional Contributor PBG
Occasional Contributor
Posts: 13

Re: COV matrix incomplete

Works like a charm!

 

Thank you very much for your help through this @jdwaterman91.

Grand Advisor
Posts: 9,584

Re: COV matrix incomplete

The COV matrix should be symmetric. And why not use IML?

Data RandomValueGeneratorStatistics (type=COV) ; 
input _TYPE_ $ _NAME_ $ variable1 variable2 variable3 variable4; 
datalines ; 
COV variable1  0.739191357 0.276109171 0.100056621 470.1092606
COV variable2  0.876109171 0.327432304 0.648272489 0.611948925
COV variable3  0.900056621 0.848272489 0.163812314 0.558222994
COV variable4  470.1092606 0.611948925 0.358222994 5474224.269
MEAN    .       4.217123172 41.69388108 4.893316488 8606.147733
;
run; 
proc iml;
use RandomValueGeneratorStatistics;
read all var{variable1 variable2 variable3 variable4} where(_TYPE_='MEAN') into mean;
read all var{variable1 variable2 variable3 variable4} where(_TYPE_='COV') into cov;
close;

n=1000;
call randseed(123456789);
x=randnormal(n,mean,cov);

create want from x;
append from x;
close;
quit;

proc corr data=want cov;
run;   
Occasional Contributor PBG
Occasional Contributor
Posts: 13

Re: COV matrix incomplete

Thanks for your help @Ksharp.

 

The matrix I'm actually using is in fact symmetric now. I don't have an IML license unfortunately. Any more ideas? Thanks for your help.

☑ This topic is SOLVED.
