BookmarkSubscribeRSS Feed
Tcook
Calcite | Level 5

Hi,

I am trying to generate 300 random sample of a trivariate normal distribution with mean (-2,4,0) and variance co-variance matrix of ( 1 0.99 0.35, 0.99 1 0.80, 0.35 0.80 1 ).

I found two different codes online and tried using them but got an Error message that my matrix is not positive definite  hence, failure in generating the random sample. Please below are the two different codes I used.

Please I would appreciate if anyone could help sort this out.  Thanks in anticipation of a favorable and swift responses.

/////////////////////          CODE ONE              /////////////////////////////////

proc iml ;

  A = {1.0 0.99 0.35, 0.99 1.0 0.80, 0.35 0.80 1.0} ;

  call eigen(DIAG, P, A); 

print A P DIAG ;

  s1 = sqrt(DIAG[1]);  

  s2 = sqrt(DIAG[2]);

  s3 = sqrt(DIAG[3]);

  seed = 20000222 ;

  e1 = {1, 0, 0} ; e2 = {0, 1, 0} ; e3 = {0, 0, 1} ;

  V = j(300, 3, 0) ;

  do i = 1 to 300 ;

  y1 = -2 + s1 * rannor(seed) ;

  y2 = 4 + s2 * rannor(seed) ;  /*  y1 ~ N(-2, s1), ys ~ N(4, s2). and y3 ~ N(0, s3)    */

  y3 = s3 * rannor(seed) ;  /*  Note that the y's have covariance */

     V[i, 1] = y1 ;            /*  Create an n x 2 matrix of the y's */

     V[i, 2] = y2 ;

     v[i, 3] = y3 ;

  end ;

  W = V * P` ;       

  varnames = 'y1':'y2':'y3' ;

  create yobs  from V [colname = varnames] ;

  varnames = 'x1':'x2':'x3' ;                   

  create trans from W [colname = varnames] ;

  append from W ;

quit ;

run ;

/////////////// CODE TWO //////////////////////////////

/* Generate the multivariate normal data in SAS/IML */

/* non-macro version */

data work1r; /* data for the parameter for the multivariate normal data */

input r1 r2 r3 means vars;

cards;

1.0 0.99 0.35 -2 1

0.99 1.0 0.80 4 1

0.35 0.80 1.0 0 1

;

proc iml;

use work1;

read all var {k1 k2 k3} into R;

read all var {means} into mu;

read all var {vars} into sigma;

p = ncol(R);

diag_sig = diag( sigma );

DRD = diag_sig * R * diag_sig`;

U = half(DRD);

do i = 1 to 300;

z = rannor( j(p,1,1234));

y = mu + U` * z;

yprime = y`;

yall = yall // yprime;

end;

varnames = { y1 y2 y3 };

create work2 from yall (|colname = varnames|);

append from yall;

quit;

proc print data=work2;

run;

7 REPLIES 7
Rick_SAS
SAS Super FREQ

You can generate mv normal data in SAS/IML by using the RandNormal function. See Sampling from the multivariate normal distribution - The DO Loop

As you've discovered, you must have a valid covariance matrix in order to generate the data. The reason why your matrix is not a valid covariance matrix, and some reasons why this occurs, are discussed in this article: When is a correlation matrix not a correlation matrix? - The DO Loop

What to DO about it? That's hard.  If you computed the covariance by using pairwise correlations, go back and recompute it by using listwise deletion of missing values.  If the correlations are estimated and you don't have the original data, you can try shrinkage methods or projection methods to obtain a nearby matrix that is a valid correlation matrix.  For example, the nearest correlation matrix (in the Frobenius norm) to your matrix is approximately

A={ 1.0  0.9  0.4,

0.9  1.0  0.75,

0.4  0.75  1.0};

Tcook
Calcite | Level 5

Thanks for your response. It is quite help but I still have some questions:

If I understand correctly, if for me to generate a random sample using the correlation matrix  MUST also have the standard deviation?

In my case all I have is the correlation matrix and mean, please what sas code/procedure can I use in generation the nearest correlation matrix based on the Frobenius norm?

Thanks in anticipation of your response.

Rick_SAS
SAS Super FREQ

I'll try to find time to write a blog post on this topic in the next week.

Rick

Tcook
Calcite | Level 5

Hi Rick,

That will be appreciated.

However, I am seriously looking for a solution to my problem am working on a project and need to continue with the main aspect of the project which depends on this. Is there anyway you  could be of help with the correct code to generate the random sample given that I have only the MEAN and a Correlation matrix that is not positive definite?

Meanwhile I was able to generate an eigenvalue that is positive using R-Code I tried generating a random sampling using your code below (

proc iml;

load module=RndOrthog; /* load or define modules here */

start RndMatWithEigenval(lambda);

   n = ncol(lambda); /* assume lambda is row vector */

   Q = RndOrthog(n);

return( Q`*diag(lambda)*Q );

finish;

lambda = {2.4 0.6 0};

call randseed(1234);

S1 = RndMatWithEigenval(lambda); /* eigenvalues are lambda */

eval = eigval(S1);

print eval;

print S1; ) but I got an error message which has to do with the line

load module=RndOrthog; /* load or define modules here */. how do I sort this out?

Thanks in anticipation of favorable response.

C. Wood.

Rick_SAS
SAS Super FREQ

I've already given you a correlation matrix that is close to your estimate. If that correlation matrix is acceptable, just call the RANDNORMAL function:

N = 1000; /* number of obs to generate */

mean = {-2,4,0};

x = RandNormal(N, mean, A);

See the RandNormal doc for details.

Tcook
Calcite | Level 5

Hi Rick,

thanks the correlation is matrix is acceptable, it will be of great help if I could get the code you used in obtaining the correlation matrix.

As I need to document it as well.

Thanks.

C. Wood.

Rick_SAS
SAS Super FREQ

Here is a description of how to get a correlation matrix when your estimate is not positive definite:

Computing the nearest correlation matrix - The DO Loop

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 2950 views
  • 3 likes
  • 2 in conversation