How to generate a data set for given covariance structure

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

How to generate a data set for given covariance structure

Please help me with my problem. I used following Macro programme to generate a data set for given cov. or corr. structure for my research( I am a graduate student). These data coming from only NORMAL distribution. But I need data from several distribution like cauchy, chisquared, web, lognormal, exponential, etc..Please can anybody tell me how I develop this to get data set from other distribution.. Thank you so much.

%macro corr2data(outdata, corrmat, n , full='T', corr='T');

  proc iml;

    use &corrmat;

    read all var _num_  into C;

      rn = nrow(C);

      cn = ncol(C);

    if (cn = rn & %upcase(&full) ="F") then do;

        do i = 1 to rn;

        do j = i to cn;

           if i = j & C[i, j] = . then C[i,j] = 1;

if  i ^= j & C[i,j]=. then C[i,j]=C[j,i];

end;

end;

       end;

     

       if %upcase(&corr) = "F" then do; /*converting the covariance to correlation*/

do i = 1 to rn;

do j = 1 to cn;

if i ^=j then C[i,j] = C[i, j]/(sqrt(C[i,i])*sqrt(C[j,j]));

         end;

end;

        do i = 1 to rn;

C[i,i] = 1;

end;

       end;

    if (cn = rn & sum(abs(C-t(C))) =0 & min(eigval(C)) > 0 & max(abs(C)) <= 1)

    then do;

      p = root(C);

      dim = nrow(C);

myvar = RANNOR(J(&n, dim, 0));

        do i = 1 to dim;

myvar[, i] = myvar[,i]-(sum(myvar[,i])/&n);

end;

        XX = (t(myvar)*myvar)/(&n-1);

        U = root(inv(XX));

        Y = myvar*T(U);

        T = Y*p;

create &outdata from T;

      append from T;

      end;

      else print "Check your input matrix, it is not a correlation matrix nor a covariance matrix.";

  quit;

%mend;

data corr;

input x1 x2 x3 x4;

datalines;

1    .8    .5  .2

.8    1    .3  .4

.5    .3 1  .7

.2  .4 .7   1

;

%corr2data(mycorr, corr, 10 , FULL='T', CORR='T'); /*5=# SUBJECTS*/

*proc print data=mycorr;

run;


Accepted Solutions
Solution
‎03-21-2013 02:49 PM
SAS Super FREQ
Posts: 3,477

Re: How to generate a data set for given covariance structure

First, you can simplify your existing code by calling the RANDNORMAL function in SAS/IML, as shown in this article:

Sampling from the multivariate normal distribution - The DO Loop

Y = RandNormal(&n, j(1,ncol(C),0), C);

Not every distribution has a multivariate version.  SAS/IML provides the multinomial (discrete), multivariate t, and some others. See SAS/IML(R) 12.1 User's Guide

You can get a MV lognormal by generating X as a MV normal and then setting Y_i = exp(X_i).

The MV Cauchy is a ratio of normal and Gamma(1/2) variates, but I haven't heard of a "correlated Cauchy":

   /* Sample from a multivariate Cauchy distribution */

   start RandMVCauchy(N, p);

      z = j(N,p,0);  y = j(N,1);         /* allocate matrix and vector */

      call randgen(z, "Normal");

      call randgen(y, "Gamma", 0.5);     /* alpha=0.5, unit scale      */

      return( z / sqrt(2*y) );

   finish;

For the other distributions:

1) I am not familiar with the "web" or the MV chisquare distribution. .

2) There are several distributions that people call "multivariate exponential." You would need to provide a reference.

In general, generating MV distributions with given correlations is hard except in certain simple cases like the MV normal.  What some people do instead is to use copulas to generate MV data that has a given marginal distribution and a given correlation struucture.  See   The COPULA Procedure .

Another useful trick is to use mixtures of normals. For example, to simulate from a distribution that has longer tails than the normal, use a mixture of two normal, one with covariance S and the other with covariance kS, k>1.

View solution in original post


All Replies
Solution
‎03-21-2013 02:49 PM
SAS Super FREQ
Posts: 3,477

Re: How to generate a data set for given covariance structure

First, you can simplify your existing code by calling the RANDNORMAL function in SAS/IML, as shown in this article:

Sampling from the multivariate normal distribution - The DO Loop

Y = RandNormal(&n, j(1,ncol(C),0), C);

Not every distribution has a multivariate version.  SAS/IML provides the multinomial (discrete), multivariate t, and some others. See SAS/IML(R) 12.1 User's Guide

You can get a MV lognormal by generating X as a MV normal and then setting Y_i = exp(X_i).

The MV Cauchy is a ratio of normal and Gamma(1/2) variates, but I haven't heard of a "correlated Cauchy":

   /* Sample from a multivariate Cauchy distribution */

   start RandMVCauchy(N, p);

      z = j(N,p,0);  y = j(N,1);         /* allocate matrix and vector */

      call randgen(z, "Normal");

      call randgen(y, "Gamma", 0.5);     /* alpha=0.5, unit scale      */

      return( z / sqrt(2*y) );

   finish;

For the other distributions:

1) I am not familiar with the "web" or the MV chisquare distribution. .

2) There are several distributions that people call "multivariate exponential." You would need to provide a reference.

In general, generating MV distributions with given correlations is hard except in certain simple cases like the MV normal.  What some people do instead is to use copulas to generate MV data that has a given marginal distribution and a given correlation struucture.  See   The COPULA Procedure .

Another useful trick is to use mixtures of normals. For example, to simulate from a distribution that has longer tails than the normal, use a mixture of two normal, one with covariance S and the other with covariance kS, k>1.

Occasional Contributor
Posts: 8

Re: How to generate a data set for given covariance structure

Dear Dr.Wicklin, THANK YOU SO MUCH for your help.You did great help. I am working with repeated model. I think your new book talk about everything What I need. I saw a comment of you that You talk about repeated data in two chapter of that book. I already ordered it. Until it come to me(end of next month?) I try to do some work of my research. I am sorry for my spelling mistake. i mean web=weibull Distribution. Thank for your help Sir... 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 998 views
  • 0 likes
  • 2 in conversation