turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- General Programming
- /
- How to generate a data set for given covariance st...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-20-2013 10:46 AM

Please help me with my problem. I used following Macro programme to generate a data set for given cov. or corr. structure for my research( I am a graduate student). These data coming from only NORMAL distribution. But I need data from several distribution like cauchy, chisquared, web, lognormal, exponential, etc..Please can anybody tell me how I develop this to get data set from other distribution.. Thank you so much.

**%macro** corr2data(outdata, corrmat, n , full='T', corr='T');

proc iml;

use &corrmat;

read all var _num_ into C;

rn = nrow(C);

cn = ncol(C);

if (cn = rn & %upcase(&full) ="F") then do;

do i = **1** to rn;

do j = i to cn;

if i = j & C[i, j] = **.** then C[i,j] = **1**;

if i ^= j & C[i,j]=**.** then C[i,j]=C[j,i];

end;

end;

end;

if %upcase(&corr) = "F" then do; /*converting the covariance to correlation*/

do i = **1** to rn;

do j = **1** to cn;

if i ^=j then C[i,j] = C[i, j]/(sqrt(C[i,i])*sqrt(C[j,j]));

end;

end;

do i = **1** to rn;

C[i,i] = **1**;

end;

end;

if (cn = rn & sum(abs(C-t(C))) =**0** & min(eigval(C)) > **0** & max(abs(C)) <= **1**)

then do;

p = root(C);

dim = nrow(C);

myvar = **RANNOR**(J(&n, dim, **0**));

do i = **1** to dim;

myvar[, i] = myvar[,i]-(sum(myvar[,i])/&n);

end;

XX = (t(myvar)*myvar)/(&n-**1**);

U = root(inv(XX));

Y = myvar*T(U);

T = Y*p;

create &outdata from T;

append from T;

end;

else print "Check your input matrix, it is not a correlation matrix nor a covariance matrix.";

quit;

**%mend**;

**data** corr;

input x1 x2 x3 x4;

datalines;

1 .8 .5 .2

.8 1 .3 .4

.5 .3 1 .7

.2 .4 .7 1

;

%** corr2data**(mycorr, corr,

*proc print data=mycorr;

**run**;

Accepted Solutions

Solution

03-21-2013
02:49 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-21-2013 02:49 PM

First, you can simplify your existing code by calling the RANDNORMAL function in SAS/IML, as shown in this article:

Sampling from the multivariate normal distribution - The DO Loop

Y = RandNormal(&n, j(1,ncol(C),0), C);

Not every distribution has a multivariate version. SAS/IML provides the multinomial (discrete), multivariate t, and some others. See SAS/IML(R) 12.1 User's Guide

You can get a MV lognormal by generating X as a MV normal and then setting Y_i = exp(X_i).

The MV Cauchy is a ratio of normal and Gamma(1/2) variates, but I haven't heard of a "correlated Cauchy":

/* Sample from a multivariate Cauchy distribution */

start RandMVCauchy(N, p);

z = j(N,p,0); y = j(N,1); /* allocate matrix and vector */

call randgen(z, "Normal");

call randgen(y, "Gamma", 0.5); /* alpha=0.5, unit scale */

return( z / sqrt(2*y) );

finish;

For the other distributions:

1) I am not familiar with the "web" or the MV chisquare distribution. .

2) There are several distributions that people call "multivariate exponential." You would need to provide a reference.

In general, generating MV distributions with given correlations is hard except in certain simple cases like the MV normal. What some people do instead is to use copulas to generate MV data that has a given marginal distribution and a given correlation struucture. See The COPULA Procedure .

Another useful trick is to use mixtures of normals. For example, to simulate from a distribution that has longer tails than the normal, use a mixture of two normal, one with covariance S and the other with covariance kS, k>1.

All Replies

Solution

03-21-2013
02:49 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-21-2013 02:49 PM

First, you can simplify your existing code by calling the RANDNORMAL function in SAS/IML, as shown in this article:

Sampling from the multivariate normal distribution - The DO Loop

Y = RandNormal(&n, j(1,ncol(C),0), C);

Not every distribution has a multivariate version. SAS/IML provides the multinomial (discrete), multivariate t, and some others. See SAS/IML(R) 12.1 User's Guide

You can get a MV lognormal by generating X as a MV normal and then setting Y_i = exp(X_i).

The MV Cauchy is a ratio of normal and Gamma(1/2) variates, but I haven't heard of a "correlated Cauchy":

/* Sample from a multivariate Cauchy distribution */

start RandMVCauchy(N, p);

z = j(N,p,0); y = j(N,1); /* allocate matrix and vector */

call randgen(z, "Normal");

call randgen(y, "Gamma", 0.5); /* alpha=0.5, unit scale */

return( z / sqrt(2*y) );

finish;

For the other distributions:

1) I am not familiar with the "web" or the MV chisquare distribution. .

2) There are several distributions that people call "multivariate exponential." You would need to provide a reference.

In general, generating MV distributions with given correlations is hard except in certain simple cases like the MV normal. What some people do instead is to use copulas to generate MV data that has a given marginal distribution and a given correlation struucture. See The COPULA Procedure .

Another useful trick is to use mixtures of normals. For example, to simulate from a distribution that has longer tails than the normal, use a mixture of two normal, one with covariance S and the other with covariance kS, k>1.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-21-2013 05:50 PM

Dear Dr.Wicklin, THANK YOU SO MUCH for your help.You did great help. I am working with repeated model. I think your new book talk about everything What I need. I saw a comment of you that You talk about repeated data in two chapter of that book. I already ordered it. Until it come to me(end of next month?) I try to do some work of my research. I am sorry for my spelling mistake. i mean web=weibull Distribution. Thank for your help Sir...