From the documentation:
The RANDDIRICHLET function generates a random sample from a Dirichlet distribution, which is a multivariate generalization of the beta distribution.
The input parameters are as follows:
N
is the number of observations to sample.
Shape
is a vector of p+1 shape parameters for the distribution.
The RANDDIRICHLET function returns an N x p matrix that contains random draws from the Dirichlet distribution.
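Note that Shape is a vector of length p+1, while the function returns an N x p matrix. The last component is omitted because the components of a Dirichlet draw sum to 1, so it is always 1 minus the sum of the others. With four shape parameters you therefore get three columns, and computing the fourth as 1 minus the sum of the first three is the right way to recover it.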
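If you would rather build all four columns inside PROC IML instead of in a later DATA step, something like the following should work. This is only a sketch: the names x4 and y are mine, and I wrote Shape as a row vector (spaces instead of commas) on the assumption that this is the form the function expects. For these test values the column means should come out close to Shape / sum(Shape), that is 0.1, 0.2, 0.3 and 0.4.

```sas
proc iml;
call randseed(1);
n = 1000;
Shape = {50 100 150 200};        /* assumption: row vector of p+1 = 4 shape parameters */
x = RandDirichlet(n, Shape);     /* n x 3 matrix: the first p components */
x4 = 1 - x[,+];                  /* x[,+] is the row sum, so x4 is the omitted component */
y = x || x4;                     /* n x 4 matrix; each row sums to 1 */
samplemean = mean(y);            /* compare with Shape / sum(Shape) */
print samplemean;
varnames = 'percentage1':'percentage4';
create MyData from y[colname=varnames];
append from y;
close MyData;
quit;
```

This gives the same result as the DATA step in your post; it just keeps the whole simulation in one place.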
@fengyuwuzu wrote:
Sorry, I did not make it clear in my original post.
My data has 4 categories, so each category has a percentage, and the 4 percentages add up to 1.
Now I want to simulate a random sample of these 4 percentages using randdirichlet.
So I want to get a random sample data set with 1000 rows and 4 columns, where each column is the
percentage of one category.
I guess I did not specify the shape correctly, so I only got 3 columns of percentages when I printed it?
I have a total of (74+104+38+24) obs, and each number represents the count of one category.
Shape = {74, 104, 38, 24};
I figured out how to get the 4th percentage, but I am not sure if this is the right way:
proc iml;
call randseed(1);
n = 1000;
Shape = {50, 100, 150, 200}; /* use simple numbers for testing */
x = RandDirichlet(n,Shape); /* x is a 1000 x 3 matrix, not 1000 x 4 */
samplemean=mean(x); /* check mean */
print samplemean;
varnames='percentage1':'percentage3';
create MyData from x[colname=varnames]; /* mydata has only first 3 percentages columns */
append from x;
close MyData;
quit;
data mydata2;
set mydata;
percentage4=1-(percentage1+percentage2+percentage3); /* to get the 4th percentage */
run;
https://support.sas.com/documentation/cdl/en/imlug/65547/HTML/default/viewer.htm#imlug_modlib_sect019.htm