Sorry, I did not make this clear in my original post.
My data has 4 categories. Each category has a percentage, and the 4 percentages add up to 1.
I want to simulate a random sample of these 4 percentages using RandDirichlet.
The simulated data set should have 1000 rows and 4 columns, where each column is the
percentage of one category.
I guess I did not specify the shape correctly, because I only get 3 columns of percentages when I print the result.
I have (74+104+38+24) observations in total, where each number is the count of one category:
Shape = {74, 104, 38, 24};
I figured out how to get the 4th percentage, but I am not sure if this is the right way:
proc iml;
call randseed(1);
n = 1000;
Shape = {50, 100, 150, 200}; /* use simple numbers for testing */
x = RandDirichlet(n,Shape); /* x is a 1000 x 3 matrix, not 1000 x 4 */
samplemean=mean(x); /* check mean */
print samplemean;
varnames='percentage1':'percentage3';
create MyData from x[colname=varnames]; /* MyData has only the first 3 percentage columns */
append from x;
close MyData;
quit;
data mydata2;
set mydata;
percentage4=1-(percentage1+percentage2+percentage3); /* to get the 4th percentage */
run;
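As a possible alternative to the extra DATA step, the 4th percentage could be appended inside PROC IML before the data set is written. This is only a sketch. It assumes that RandDirichlet returns the first 3 of the 4 components (which is what the 3-column output suggests), so the last one is 1 minus the row sum. The names x4, p, and expected are made up for illustration.

```sas
proc iml;
call randseed(1);
n = 1000;
Shape = {50, 100, 150, 200};         /* same test values as above */
x = RandDirichlet(n, Shape);         /* 1000 x 3 matrix */
x4 = 1 - x[ ,+];                     /* 1 minus each row sum: assumed 4th component */
p = x || x4;                         /* 1000 x 4 matrix, each row sums to 1 */
samplemean = mean(p);                /* column means of the simulated percentages */
expected = rowvec(Shape) / sum(Shape); /* Dirichlet mean: shape_i / sum of shapes */
print samplemean, expected;          /* the two rows should be close */
varnames = 'percentage1':'percentage4';
create MyData from p[colname=varnames];
append from p;
close MyData;
quit;
```

Comparing samplemean with expected is a quick sanity check on all 4 columns, including the derived one.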