SWEETSAS
Obsidian | Level 7

Hi Experts:

I see that there are different SAS statements for generating exponential and gamma variates. This is what I have used:

seed=2345;

theta=0.5; *scale parameter;

V=theta*ranexp(seed);

W=2*rangam(seed,theta);

These are not giving me what I want when I try to estimate the parameters of the generated distribution.

Below is the distribution I am trying to generate:

Vi ~ Exponential(θ); i = 1, 2, ..., n

Wi ~ Gamma(2); i = 1, 2, ..., n

To be sure that my generation is okay, I want to estimate the parameters of the generated distribution to see how close it is to the value I use in generating the distribution.

Thanks in advance

Jack

1 ACCEPTED SOLUTION

Rick_SAS
SAS Super FREQ

I assume you want the solution in IML? There are many parameterizations of distribution functions, but it sounds like you want

1) Exponential with scale parameter sigma=0.5

2) Gamma with shape parameter alpha=2 and scale parameter sigma=0.5

For both situations you can allocate a vector (or matrix) and fill it by calling the RANDGEN subroutine. The following generates vectors with 1,000 variates. The program then writes the simulated data to a data set and calls PROC UNIVARIATE to verify that the MLE estimates are close to the specified values:

proc iml;
call randseed(1);

sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

create test var {E G}; append; close;

quit;

proc univariate data=test;
histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);
histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);
run;
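
For readers who work in the DATA step rather than IML, a rough equivalent is sketched below. It assumes the same targets (scale 0.5 for the exponential; shape 2 and scale 0.5 for the gamma). The data set name test2 and the seed value are arbitrary choices for illustration. Note that in the question's code, 2*rangam(seed,theta) passes theta as the shape and uses 2 as the scale, which is the reverse of what is wanted.

```sas
data test2;
   call streaminit(1);                   /* set the seed for the RAND function */
   sigma = 0.5;                          /* scale parameter */
   alpha = 2;                            /* gamma shape parameter */
   do i = 1 to 1000;
      E = sigma * rand("Exponential");   /* rescale a standard exponential: E ~ Expo(scale=0.5) */
      G = sigma * rand("Gamma", alpha);  /* rescale a unit-scale gamma: G ~ Gamma(shape=2, scale=0.5) */
      output;
   end;
   keep E G;
run;
```

The same PROC UNIVARIATE step can then be run with data=test2 to check the estimates.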

SWEETSAS
Obsidian | Level 7

Thanks Reeza.

I used this univariate procedure earlier, but the parameter estimates that I obtained are far from what I used in simulation. Perhaps, I am not simulating the data correctly.

Ksharp
Super User

Can't RANDGEN() do this?

call randgen(x, 'EXPO');

If you want to estimate a distribution's parameters, check the following article by Rick, which shows how to find the best parameter values by maximizing the likelihood function.

http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml.html

SWEETSAS
Obsidian | Level 7

Thanks. What's the difference between randgen and randexp?

Ksharp
Super User

According to @Rick, it is an obsolete function that can't generate real random numbers, and it is deprecated in the documentation.

RANDGEN() does, however. Therefore, always use RANDGEN() to generate any sort of distribution, NOT randXXX().

SWEETSAS
Obsidian | Level 7

Thanks Xia. I will try and see what I get

Rick_SAS
SAS Super FREQ

The numbers from RANDGEN are not "real" either. Both algorithms generate pseudo-random numbers. It is just that the algorithm that RANDGEN uses is more sophisticated.  For details, see

"Six reasons you should stop using the RANUNI function to generate random numbers"

SWEETSAS
Obsidian | Level 7

Thanks!!!! Worked like a charm!!

SWEETSAS
Obsidian | Level 7

Hi Rick,

 

After reading your post, I have begun to learn how to use IML for simulation.

 

The following program works perfectly well; it gives me what I want. However, in the DATA step, I can easily create a sample ID using a "do rep=1 to m" loop and analyze by that sample ID. By sample ID, I mean replicate. How do I generate the sample ID (replicate) in IML so that I can analyze my data by sample ID?

 

Thanks in advance:

 

proc iml;
call randseed(1);

sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

 

create test var {E G}; append; close;

quit;

 

proc univariate data=test;
histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);
histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);
run;

IanWakeling
Barite | Level 11

There are probably lots of ways of solving this. Your DATA step solution could be made to work in IML too: you could write a loop and APPEND inside it, each time adding a record with the loop variable and a single random number. But if you want to retain the efficiency of Rick's code, which generates a vector of random numbers all at once, then you will need to create vectors of the same length that contain the required ID information. I like to use the direct product operator '@' for this, as follows (assuming you want to generate 10 reps for each of 100 IDs):

 

proc iml;
call randseed(1);

sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

ID = t(1:100) @ j(10,1);
REP = j(100,1) @ t(1:10);

create test var {ID REP E G}; append; close;

quit;
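
Once the ID column is in the data set, each sample can be analyzed with a BY statement. The sketch below assumes the test data set created above, which is already sorted by ID. The output data set name est and the variable names MeanE and MeanG are illustrative only. The sample mean of E estimates the exponential scale (0.5), and the sample mean of G estimates shape times scale (2 x 0.5 = 1).

```sas
proc means data=test noprint;
   by ID;                             /* one set of statistics per sample ID */
   var E G;
   output out=est mean=MeanE MeanG;   /* est has one row per ID */
run;

proc means data=est mean std;         /* summarize the estimates across samples */
   var MeanE MeanG;
run;
```

With only 10 observations per ID, the per-sample estimates will vary considerably; increasing the number of reps per ID tightens them.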

Rick_SAS
SAS Super FREQ

If you have my book Simulating Data with SAS, you can read about this on p. 69 and throughout the book.

 

Ian is a whiz with the Kronecker direct product (@). I am less proficient, so I tend to use the REPEAT function. You can read about my approach in the article "Create an ID vector for repeated measurements."
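
A minimal sketch of the REPEAT approach, assuming the same layout as Ian's example (100 IDs with 10 replicates each). The names NumID and NumRep are chosen here for illustration and are not from the article. COLVEC unrolls a matrix in row-major order, so each vector has 1,000 elements in the same order as the direct-product version.

```sas
proc iml;
NumID  = 100;   /* number of IDs (illustrative name) */
NumRep = 10;    /* replicates per ID (illustrative name) */

/* 100 x 10 matrix whose i_th row is all i; unrolled: 1,...,1,2,...,2,... */
ID  = colvec( repeat(T(1:NumID), 1, NumRep) );

/* 100 x 10 matrix whose every row is 1:10; unrolled: 1,...,10,1,...,10,... */
REP = colvec( repeat(1:NumRep, NumID, 1) );

print (ID[1:12]) (REP[1:12]);   /* inspect the first few elements */
quit;
```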

 

 

PS. Please address your questions to the list, not to me personally. Thanks.
