Hi Experts:
I observe there are different SAS statement to generate exponential and gamma distribution. This is what I have used.
seed=2345;
theta=0.5; *scale parameter;
V=theta*ranexp(seed);
W=2*rangam(seed,theta)
These are not giving me what I want when I try to estimate the parameters of the generated distribution.
Below is the distribution I am trying to generate:
Vi ∼ Exponential(θ);i = 1,2,...n
Wi ∼ Gamma(2,θ);i = 1,2,...n
To be sure that my generation is okay, I want to estimate the parameters of the generated distribution to see how close it is to the value I use in generating the distribution.
Thanks in advance
Jack
I assume you want the solution in IML? There are many parameterization of distribution functions, but it sounds like you want
1) Expoentiatl with scale parameter sigma=0.5
2) Gamma with shape parameter alpha=2 and shape parameter sigma=0.5;
For both situations you can allocate a vector (or matrix) and fill it up by calling the RANDGEN subroutine. The following generates vectors with 1,000 variates. The program then writes the simulated data to a data set and call PROC UNIVARIATE to verify that the MLE estimates are close to the specified values:
proc iml;
call randseed(1);
sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */
create test var {E G}; append; close;
quit;
proc univariate data=test;
histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);
histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);
run;
Proc Univariate will generate distributions parameters
Base SAS(R) 9.2 Procedures Guide: Statistical Procedures, Third Edition
Thanks Reeza.
I used this univariate procedure earlier, but the parameter estimates that I obtained are far from what I used in simulation. Perhaps, I am not simulating the data correctly.
RANDGEN() can't get it ?
call randgen(x, 'EXPO');
If you want estimate distribution's parameter , check the following link by Rick , how to get the best parameter by Maximizing Likelihood Function.
http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml.html
Thanks. What's the difference between randgen and randexp?
According to @Rick , It is obsolete function which couldn't generate real random number ,and it is deprecated in Documentation.
However RANDGEN() does , Therefore, always use RANDGEN() to generate any sort of distribution , NOT randXXX() .
Thanks Xia. I will try and see what I get
The numbers from RANDGEN are not "real" either. Both algorithms generate pseudo-random numbers. It is just that the algorithm that RANDGEN uses is more sophisticated. For details, see
"Six reasons you should stop using the RANUNI function to generate random numbers"
I assume you want the solution in IML? There are many parameterization of distribution functions, but it sounds like you want
1) Expoentiatl with scale parameter sigma=0.5
2) Gamma with shape parameter alpha=2 and shape parameter sigma=0.5;
For both situations you can allocate a vector (or matrix) and fill it up by calling the RANDGEN subroutine. The following generates vectors with 1,000 variates. The program then writes the simulated data to a data set and call PROC UNIVARIATE to verify that the MLE estimates are close to the specified values:
proc iml;
call randseed(1);
sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */
create test var {E G}; append; close;
quit;
proc univariate data=test;
histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);
histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);
run;
Thanks!!!! Worked like a charm!!
Hi Rick,
Please, after reading your post I have began to learn how to IML for simulation
The following program works perfectly well; it gives me what I want. However, in data step, I can easily create sample ID using a do "rep=1 to m" and analyze by that sample ID. By sample ID, I mean replicate. How do I generate the sample ID (replicate) in IML so that I can analyze my data by sample ID?
Thanks in advance:
proc iml;
call randseed(1);
sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */
create test var {E G}; append; close;
quit;
proc univariate data=test;
histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);
histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);
run;
There are probably lots of ways of solving this. Your data step solution could be made to work in IML too, as you could write a loop and then APPEND inside, each time adding records with the loop variable and a single random number. But if you want to retain the efficiency of Rick's code that generates a vector of random numbers all at once, then you will need to create vectors the same length, that contain the required ID information. I like to use the direct product operator '@' for this as follows (assuming you want to generate 10 reps for each of 100 IDs):
proc iml;
call randseed(1);
sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */
ID = t(1:100) @ j(10,1);
REP = j(100,1) @ t(1:10);
create test var {ID REP E G}; append; close;
quit;
If you have my book Simulating Data with SAS, you can read about this on p. 69 and throughout the book.
Ian is a whiz with the Kronecker direct product (@). I am less proficient, so I tend to use the REPEAT function. You can read about my approach in the article "Create an ID vector for repeated measurements."
PS. Please address your questions to the list, not to me personally. Thanks.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.