I need to generate random values for two beta-distributed variables that are correlated. The two variables of interest are characterized as follows:
----
X1 has mean = 0.896 and variance = 0.001.
X2 has mean = 0.206 and variance = 0.004.
For X1 and X2, p = 0.5, where p is the correlation coefficient.
----
I understand how to generate a random number specifying a beta distribution using the function X = RAND('BETA', a, b), where a and b are the two shape parameters for a variable X that can be calculated from the mean and variance. However, I want to generate values for both X1 and X2 simultaneously while specifying that they are correlated at p = 0.5.
Looks like a question for Rick Wicklin
This is a duplicate of the same question asked on StackOverflow.
Run the SAS/IML program on p. 166 of Simulating Data with SAS, but substitute the Beta distribution for the Gamma and Exponential distributions that appear in the book.
Thanks Rick--this is the solution I came to yesterday. Thanks for producing such a fantastic resource.
What values did you get for alpha_1, beta_1 and alpha_2, beta_2?
data corr_vars;
input x1 var1 x2 var2; *var1 and var2 are the variances for x1 and x2;
a_x1 = ((1 - x1) / var1 - 1/ x1) * x1**2;
a_x2 = ((1 - x2) / var2 - 1/ x2) * x2**2;
b_x1 = a_x1 * (1 / x1 - 1);
b_x2 = a_x2 * (1 / x2 - 1);
datalines;
0.896 0.001 0.207 0.004
;
proc print data = corr_vars;
run;
Therefore:
alpha1 = 82.597
beta1 = 9.587
alpha2 = 8.289
beta2 = 31.750
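For anyone checking the arithmetic: the DATA step expressions appear to be the usual method-of-moments formulas for the beta distribution, rearranged. A short sketch, writing \mu for the mean and \sigma^2 for the variance (my notation, not the original poster's):

```latex
% Method-of-moments shape parameters for Beta(\alpha, \beta),
% valid when \sigma^2 < \mu (1 - \mu).
\[
\alpha = \left(\frac{1-\mu}{\sigma^2} - \frac{1}{\mu}\right)\mu^2
       = \mu\left(\frac{\mu(1-\mu)}{\sigma^2} - 1\right),
\qquad
\beta = \alpha\left(\frac{1}{\mu} - 1\right)
      = (1-\mu)\left(\frac{\mu(1-\mu)}{\sigma^2} - 1\right).
\]
% Worked check for X1 (\mu = 0.896, \sigma^2 = 0.001):
\[
\frac{\mu(1-\mu)}{\sigma^2} - 1 = \frac{0.896 \times 0.104}{0.001} - 1 = 92.184,
\qquad
\alpha_1 = 0.896 \times 92.184 \approx 82.597,
\qquad
\beta_1 = 0.104 \times 92.184 \approx 9.587.
\]
```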
Then, here is the code I used to generate the correlated rates based on the book chapter:
proc iml;
call randseed(12345);
N = 10000; *number of random variable sets to generate;
Z = RandNormal(N, {0, 0}, {1 0.5, 0.5 1}); *RandNormal(N, Mean, Cov);
U = cdf("Normal", Z);
x1_beta = quantile('BETA', U[,1], 82.597, 9.587);
x2_beta = quantile('BETA', U[,2], 8.289, 31.750);
X = x1_beta || x2_beta; *here are my correlated variables, beta-distributed;
rhoZ = corr(Z)[1,2]; *check correlations;
rhoX = corr(X)[1,2];
print X;
print rhoZ rhoX;
quit;
If you are interested in Pearson correlations, the correlation for the MVN data is not exactly the correlation that you want for the beta variables. It needs to be modified, as explained on p. 167. For this example, I think you want to use
rho = 0.5105
for the MVN data in order to have the correlation of the beta variables be 0.5.
For Spearman (rank) correlations, no adjustment is necessary.
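A minimal sketch of how this adjustment could be checked, assuming the shape parameters posted earlier in the thread and the suggested value rho = 0.5105. The variable names (Sigma, pearsonX, spearmanZ, spearmanX) are only illustrative, and the estimates will vary with the seed and sample size:

```sas
proc iml;
call randseed(12345);
N = 10000;                               /* number of simulated pairs */
rho = 0.5105;                            /* adjusted MVN correlation suggested above */
Sigma = (1 || rho) // (rho || 1);        /* 2 x 2 correlation matrix for the normal data */
Z = RandNormal(N, {0 0}, Sigma);         /* correlated standard normal pairs */
U = cdf("Normal", Z);                    /* map each column to Uniform(0,1) */
/* shape parameters taken from the earlier post in this thread */
X = quantile("Beta", U[,1], 82.597, 9.587) ||
    quantile("Beta", U[,2], 8.289, 31.750);
pearsonX  = corr(X)[1,2];                /* Pearson correlation of the beta variables */
spearmanZ = corr(Z, "Spearman")[1,2];    /* rank correlation of the normal data */
spearmanX = corr(X, "Spearman")[1,2];    /* rank correlation of the beta data */
print pearsonX spearmanZ spearmanX;
quit;
```

If the adjustment works as described, pearsonX should land close to 0.5, up to sampling error. The two Spearman values should agree with each other, because the CDF and quantile transformations are monotone increasing and therefore preserve ranks.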
Rick,
I am trying to apply the solution here to a slightly different problem: generating an autocorrelated time series of, say, 24 intervals (the 24 hours in a day), with each interval following a Weibull distribution whose parameters differ from hour to hour. My understanding is that each value depends on the previous one, so they have to be generated in sequence. Any thoughts?
Thanks,
Bo
Yes, I think if you have a different question then you should start a new thread instead of appending to a thread from 2015.