I use the following code to simulate some data but do not know how to write it to a data set.
When I use print x, it prints only 3 columns of data, not 4. Please help.
proc iml;
call randseed(1);
n = 1000;
Shape = {74, 104, 38, 24};
x = RandDirichlet(n,Shape);
print x;
Like this?
proc iml;
call randseed(1);
n = 1000;
Shape = {74, 104, 38, 24};
x = RandDirichlet(n,Shape);
print x;
create MyData var {x};
append;
close MyData;
quit;
Yes, but this outputs only one column of data with 3000 rows.
I want to output a data set with 1000 rows and 4 columns.
You have to list the columns but otherwise the process is the same.
See this blog post:
https://blogs.sas.com/content/iml/2011/04/18/writing-data-from-a-matrix-to-a-sas-data-set.html
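A minimal sketch of what "listing the columns" might look like, using the CREATE ... FROM form instead of the VAR clause. The variable names p1-p3 are placeholders chosen for illustration; they do not come from this thread. The matrix x has three columns here, because that is what PRINT showed for this Shape.

```sas
proc iml;
call randseed(1);
n = 1000;
Shape = {74, 104, 38, 24};
x = RandDirichlet(n, Shape);            /* one row per draw */
varNames = {"p1" "p2" "p3"};            /* placeholder names, one per column of x */
create MyData from x[colname=varNames]; /* each column of x becomes a variable */
append from x;
close MyData;
quit;
```

Written this way, MyData has one row per draw and one variable per column of x, rather than a single stacked column.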
@fengyuwuzu wrote:
Yes, but this outputs only one column of data with 3000 rows.
I want to output a data set with 1000 rows and 4 columns.
Why do you expect this to be 4 columns and not one? Just trying to figure out what you want to do.
Sorry, I did not make it clear in my original post.
My data has 4 categories, so each category has a percentage, and the 4 percentages add up to 1.
Now I want to simulate a random sample of these 4 percentages using RandDirichlet.
The result should be a data set with 1000 rows and 4 columns, where each column is the percentage of one category.
I guess I did not specify the shape correctly, so I got only 3 columns of percentages when I printed it?
I have a total of (74+104+38+24) observations, and each number represents the count of one category.
Shape = {74, 104, 38, 24};
I figured out how to get the 4th percentage, but I am not sure if this is the right way:
proc iml;
call randseed(1);
n = 1000;
Shape = {50, 100, 150, 200}; /* use simple numbers for testing */
x = RandDirichlet(n,Shape); /* x is a 1000 x 3 matrix, not 1000 x 4 */
samplemean=mean(x); /* check mean */
print samplemean;
varnames='percentage1':'percentage3';
create MyData from x[colname=varnames]; /* mydata has only first 3 percentages columns */
append from x;
close MyData;
quit;
data mydata2;
set mydata;
percentage4=1-(percentage1+percentage2+percentage3); /* to get the 4th percentage */
run;
From the documentation:
The RANDDIRICHLET function generates a random sample from a Dirichlet distribution, which is a multivariate generalization of the beta distribution.
The input parameters are as follows:
N is the number of observations to sample.
Shape is a 1 x (p+1) vector of shape parameters for the distribution, Shape[i] > 0.
The RANDDIRICHLET function returns an N x p matrix that contains N random draws from the Dirichlet distribution.
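Note that Shape is a vector of length p+1, while the function returns an N x p matrix. With four shape parameters you get three columns, and the fourth proportion is 1 minus the sum of the other three.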
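As a sketch only, the fourth column could also be built inside PROC IML instead of in a separate DATA step. This assumes the simplified test values for Shape used earlier in the thread; the names x4 and y are introduced here for illustration.

```sas
proc iml;
call randseed(1);
n = 1000;
Shape = {50, 100, 150, 200};            /* simple numbers for testing */
x = RandDirichlet(n, Shape);            /* 1000 x 3 matrix */
x4 = 1 - x[ ,+];                        /* x[ ,+] is the row sum; the remainder is the 4th proportion */
y = x || x4;                            /* 1000 x 4 matrix; each row sums to 1 */
sampleMean = mean(y);                   /* column means, for a quick check */
print sampleMean;
varNames = "percentage1":"percentage4";
create MyData from y[colname=varNames];
append from y;
close MyData;
quit;
```

The mean of each Dirichlet component is its shape parameter divided by the sum of all shape parameters, so with these test values the printed column means should be close to 0.1, 0.2, 0.3, and 0.4 (50, 100, 150, and 200 divided by 500).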