simulation with two normal distributions

09-14-2011 06:06 PM

Hello. I am trying to simulate in SAS a case where two randomly generated normal variables (say X and Y, with 1000 observations each) have a correlation of 0.5 (corr(X,Y)=0.5).

I know that corr(X,Y)= cov(X,Y)/(stdX*stdY)

where cov(X,Y)=E(XY)-E(X)E(Y).

Now my question is how do you figure out E(XY)?

Thanks for the help in advance!


All Replies

Posted in reply to kzepilos

09-14-2011 06:15 PM

So, you want to calculate the conditional mean?

Posted in reply to kzepilos

09-14-2011 06:26 PM

I might be missing something, but from the simulation, can't you multiply each x*y pair and take the average?

Or do you need an explicit mathematical formula?

Posted in reply to kzepilos

09-14-2011 06:59 PM

The main purpose of this simulation is to generate two normal variables with a correlation of 0.5 using the random number generator rannor in SAS.

Since the two normal variables are not independent, I don't know how I can come up with the joint density function f(x,y) in order to figure out E(XY). Any other ideas?

Thanks for the help.
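For reference, E(XY) can be obtained without the joint density, by rearranging the two identities quoted in the question. A short derivation, assuming X and Y have means \mu_X, \mu_Y and standard deviations \sigma_X, \sigma_Y (these symbols are introduced here for illustration and are not in the original post):

```latex
\[
\begin{aligned}
\operatorname{corr}(X,Y) &= \frac{\operatorname{cov}(X,Y)}{\sigma_X\,\sigma_Y},
\qquad
\operatorname{cov}(X,Y) = E(XY) - E(X)\,E(Y) \\[4pt]
\Longrightarrow\quad
E(XY) &= \operatorname{corr}(X,Y)\,\sigma_X\,\sigma_Y + \mu_X\,\mu_Y
\end{aligned}
\]
```

Under the additional assumption that X and Y are both standard normal (mean 0, standard deviation 1), a target correlation of 0.5 gives E(XY) = 0.5.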

Posted in reply to kzepilos

09-15-2011 01:01 AM

You do not need to figure out E(XY); there is a formula for calculating the correlation coefficient.

You can brute-force it by resampling until the sample correlation hits the target. I used the standard normal distribution N(0,1), but a correlation of 0.5 looks hard to reach this way, so I set the target correlation to 0.1.

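%macro corr;
  %local found;
  %do %until(&found eq Y);
    /* Draw 1000 independent standard normal pairs */
    data normal(drop=i);
      do i=1 to 1000;
        x=rannor(0);
        y=rannor(0);
        output;
      end;
    run;

    /* Keep only the correlation row of the output data set */
    proc corr data=normal outp=corr(where=(_type_='CORR')) noprint;
      var x;
      with y;
    run;

    /* Stop once the sample correlation rounds to 0.1 */
    %let found=N;
    data _null_;
      set corr;
      if round(x,.1) eq .1 then call symputx('found','Y');
    run;
  %end;
%mend corr;

%corr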

Ksharp

Posted in reply to Ksharp

09-15-2011 09:10 AM

Thanks a lot for your help. It definitely helped!

Posted in reply to kzepilos

09-15-2011 03:00 PM

There's no need to hunt for the correlation. It is generated through this process:

- Generate two separate random normal variables. Call them x1 and x2.
- If you want "drawn from a population with correlation R", skip this step. If you want the drawn sample itself to have correlation R, put the two columns through principal components (PROC PRINCOMP or PROC FACTOR) to generate two factors that are completely uncorrelated. The reason is that the sampled x1 and x2 from above might have a non-zero correlation.
- Use this formula: Y = x1 * R + x2 * sqrt( 1 - R**2 ). Y will be correlated with x1 with correlation R (exactly R in the sample when x1 and x2 are replaced by the uncorrelated factors from the previous step, scaled to equal variance).
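To see why the formula in the last step produces correlation R, here is a short derivation. It assumes x1 and x2 are uncorrelated with zero mean and unit variance, which holds in the population for independent standard normal draws and holds in the sample only after the decorrelation step:

```latex
\[
\begin{aligned}
Y &= R\,x_1 + \sqrt{1-R^2}\;x_2 \\[4pt]
\operatorname{cov}(x_1, Y) &= R\,\operatorname{var}(x_1) + \sqrt{1-R^2}\,\operatorname{cov}(x_1, x_2) = R \\[4pt]
\operatorname{var}(Y) &= R^2\,\operatorname{var}(x_1) + (1-R^2)\,\operatorname{var}(x_2)
  + 2R\sqrt{1-R^2}\,\operatorname{cov}(x_1, x_2) = 1 \\[4pt]
\operatorname{corr}(x_1, Y) &= \frac{\operatorname{cov}(x_1, Y)}{\sqrt{\operatorname{var}(x_1)\,\operatorname{var}(Y)}} = R
\end{aligned}
\]
```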

For more variables, in general, take the components matrix and post-multiply it by the Cholesky factor of R, the correlation matrix.

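%let r=0.5;

/* Draw 1000 pairs of independent standard normals */
data test;
  call streaminit( 123 );
  do i = 1 to 1000;
    x1 = rand('normal');
    x2 = rand('normal');
    output;
  end;
  drop i;
run;

/* Uncorrelated components. STD scales the scores to unit variance, */
/* which is needed for corr(prin1, yp) to equal R exactly.          */
proc princomp data=test out=pc std;
  var x1 x2;
run;

data test1;
  set pc;
  yr = x1 * &r + x2 * sqrt( 1 - (&r)**2 );        /* R holds in the population only */
  yp = prin1 * &r + prin2 * sqrt( 1 - (&r)**2 );  /* R holds exactly in the sample  */
run;

proc corr data=test1;
  var x1 x2 yr prin1 prin2 yp;
run;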
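The general case mentioned above (more than two variables, using the Cholesky factor of R) could be sketched as follows. This is only an illustration under stated assumptions: it requires SAS/IML, the 3x3 correlation matrix and the seed are made-up values, and it starts from raw independent draws rather than principal components, so the target correlations hold in the population and only approximately in the sample.

```sas
/* Sketch: three correlated normal variables via a Cholesky factor. */
/* Assumes SAS/IML is licensed; R and the seed are illustrative.    */
proc iml;
  call randseed(123);

  /* Hypothetical target correlation matrix (positive definite) */
  R = {1.0 0.5 0.3,
       0.5 1.0 0.2,
       0.3 0.2 1.0};

  z = j(1000, 3, .);            /* allocate a 1000 x 3 matrix       */
  call randgen(z, "Normal");    /* fill with independent N(0,1)     */

  U = root(R);                  /* upper triangular, U` * U = R     */
  x = z * U;                    /* post-multiply by Cholesky factor */

  sampleCorr = corr(x);         /* should be close to R             */
  print sampleCorr;
quit;
```

Post-multiplying by the upper-triangular factor works because the rows of z have identity covariance, so the rows of z * U have covariance U` * U = R.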

Solution

Posted in reply to DLing

09-15-2011 04:04 PM

Thanks DLing!