Solved: Generating exponential and gamma distribution

SWEETSAS · Posted 08-01-2015 08:35 PM

Hi Experts:

I see that there are different SAS statements for generating exponential and gamma variates. This is what I have used:

seed=2345;

theta=0.5; *scale parameter;

V=theta*ranexp(seed);

W=2*rangam(seed,theta);

These are not giving me what I want when I try to estimate the parameters of the generated distribution.

Below is the distribution I am trying to generate:

Vi ∼ Exponential(θ);i = 1,2,...n

Wi ∼ Gamma(2,θ);i = 1,2,...n

To be sure that my generation is okay, I want to estimate the parameters of the generated distribution to see how close it is to the value I use in generating the distribution.

Thanks in advance

Jack

Rick_SAS · Posted 08-02-2015 07:05 AM

I assume you want the solution in IML? There are many parameterizations of distribution functions, but it sounds like you want

1) Exponential with scale parameter sigma=0.5

2) Gamma with shape parameter alpha=2 and scale parameter sigma=0.5
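
For reference, here is a sketch of the two densities under the scale parameterization assumed above (my understanding is that this is the form RANDGEN uses when it is given a scale argument). The moments are useful for checking simulated data by eye:

```latex
% Exponential with scale \sigma
f_E(x) = \frac{1}{\sigma}\, e^{-x/\sigma}, \quad x \ge 0,
\qquad \mathrm{E}[X] = \sigma, \quad \mathrm{Var}[X] = \sigma^2

% Gamma with shape \alpha and scale \sigma
f_G(x) = \frac{x^{\alpha-1}\, e^{-x/\sigma}}{\Gamma(\alpha)\,\sigma^{\alpha}}, \quad x > 0,
\qquad \mathrm{E}[X] = \alpha\sigma, \quad \mathrm{Var}[X] = \alpha\sigma^2

% With \sigma = 0.5 and \alpha = 2:
% exponential mean 0.5, variance 0.25; gamma mean 1, variance 0.5
```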

For both situations you can allocate a vector (or matrix) and fill it by calling the RANDGEN subroutine. The following generates vectors with 1,000 variates. The program then writes the simulated data to a data set and calls PROC UNIVARIATE to verify that the MLE estimates are close to the specified values:

proc iml;
call randseed(1);

sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

create test var {E G}; append; close;

quit;

proc univariate data=test;
histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);
histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);
run;
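
Before fitting, a quick check of the sample moments can also catch a wrong parameterization. The following is a minimal sketch that repeats the simulation above and prints sample means and variances. The variable names meanE, meanG, varE, and varG are only illustrative:

```sas
proc iml;
call randseed(1);

sigma = 0.5;                            /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma);  /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma);     /* G ~ Gamma(2, 0.5) */

meanE = mean(E);   /* should be near sigma     = 0.5  */
meanG = mean(G);   /* should be near 2*sigma   = 1.0  */
varE  = var(E);    /* should be near sigma**2  = 0.25 */
varG  = var(G);    /* should be near 2*sigma**2 = 0.5 */
print meanE meanG varE varG;
quit;
```

If the gamma mean comes out near 0.5*2 = 1 but the variance is far from 0.5, the shape and scale arguments have probably been swapped.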

Reeza · Posted 08-01-2015 11:39 PM

PROC UNIVARIATE will estimate the distribution parameters:

Base SAS(R) 9.2 Procedures Guide: Statistical Procedures, Third Edition

SWEETSAS · Posted 08-02-2015 12:16 AM

Thanks Reeza.

I used this univariate procedure earlier, but the parameter estimates that I obtained are far from what I used in simulation. Perhaps, I am not simulating the data correctly.

Ksharp · Posted 08-02-2015 12:08 AM

Can't RANDGEN do it?

call randgen(x, 'EXPO');

If you want to estimate a distribution's parameters, see the following post by Rick on finding the best parameters by maximizing the likelihood function.

http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml.html

SWEETSAS · Posted 08-02-2015 12:19 AM

Thanks. What's the difference between RANDGEN and RANEXP?

Ksharp · Posted 08-02-2015 12:27 AM

According to @Rick, RANEXP is an obsolete function that can't generate truly random numbers, and it is deprecated in the documentation.

RANDGEN, however, can. Therefore, always use RANDGEN to generate any sort of distribution, not the RANxxx functions.
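
For readers working in the DATA step rather than IML, the counterpart of this advice is the RAND function seeded with CALL STREAMINIT. The sketch below assumes the scale parameterization from the original question (theta=0.5). The data set name sim and the sample size of 1,000 are only illustrative. Note also that the second argument of RANGAM is the shape parameter, so the original `2*rangam(seed,theta)` draws a gamma variate with shape 0.5 and multiplies it by 2, which is not Gamma(shape=2, scale=theta).

```sas
data sim;
   call streaminit(2345);                 /* seed for the RAND function */
   theta = 0.5;                           /* scale parameter */
   do i = 1 to 1000;
      V = theta * rand("Exponential");    /* V ~ Exponential(scale=theta) */
      W = theta * rand("Gamma", 2);       /* W ~ Gamma(shape=2, scale=theta) */
      output;
   end;
   drop i;
run;
```

Multiplying a unit-scale variate by theta is used here because it works regardless of whether the installed release accepts an optional scale argument in RAND.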

SWEETSAS · Posted 08-02-2015 12:52 AM

Thanks Xia. I will try and see what I get

Rick_SAS · Posted 08-02-2015 07:08 AM

The numbers from RANDGEN are not "real" either. Both algorithms generate pseudo-random numbers. It is just that the algorithm that RANDGEN uses is more sophisticated. For details, see

"Six reasons you should stop using the RANUNI function to generate random numbers"

SWEETSAS · Posted 08-02-2015 09:05 AM

Thanks!!!! Worked like a charm!!

SWEETSAS · Posted 09-10-2015 09:33 PM

Hi Rick,

After reading your post, I have begun to learn how to use IML for simulation.

The following program works perfectly well; it gives me what I want. However, in the DATA step I can easily create a sample ID using a "do rep=1 to m" loop and then analyze by that sample ID. By sample ID, I mean replicate. How do I generate the sample ID (replicate) in IML so that I can analyze my data by sample ID?

Thanks in advance:

proc iml;
call randseed(1);

sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

create test var {E G}; append; close;

quit;

proc univariate data=test;
histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);
histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);
run;

IanWakeling · Posted 09-11-2015 03:36 AM

There are probably lots of ways of solving this. Your DATA step solution could be made to work in IML too: you could write a loop and APPEND inside it, each time adding a record with the loop variable and a single random number. But if you want to retain the efficiency of Rick's code, which generates a vector of random numbers all at once, then you will need to create vectors of the same length that contain the required ID information. I like to use the direct product operator '@' for this as follows (assuming you want to generate 10 reps for each of 100 IDs):

proc iml;
call randseed(1);

sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

ID = t(1:100) @ j(10,1);
REP = j(100,1) @ t(1:10);

create test var {ID REP E G}; append; close;

quit;
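
Once the ID column exists, a BY-group analysis works as it would for DATA step output. As a minimal sketch, assuming the TEST data set created above (which is already ordered by ID) and using IDStats, MeanE, and MeanG as illustrative names:

```sas
/* One row of summary statistics per ID (10 replicates each) */
proc means data=test noprint;
   by ID;
   var E G;
   output out=IDStats mean=MeanE MeanG;
run;

proc print data=IDStats(obs=5);
run;
```

The same BY statement can be added to PROC UNIVARIATE if distribution fits are wanted for each ID.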

Rick_SAS · Posted 09-11-2015 08:45 AM

If you have my book Simulating Data with SAS, you can read about this on p. 69 and throughout the book.

Ian is a whiz with the Kronecker direct product (@). I am less proficient, so I tend to use the REPEAT function. You can read about my approach in the article "Create an ID vector for repeated measurements."
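
A sketch of how the same two vectors might be built with REPEAT, for the 100 IDs by 10 reps layout used above. This is an illustration of the idea rather than code taken from the article:

```sas
proc iml;
/* 100 x 10 matrix whose i-th row is all i; COLVEC reads it row by row */
ID  = colvec(repeat(T(1:100), 1, 10));   /* 1,1,...,1,2,2,...,2,... */

/* stack 100 copies of the column 1..10 */
REP = repeat(T(1:10), 100, 1);           /* 1,2,...,10,1,2,...,10,... */

print (ID[1:12])[label="ID"] (REP[1:12])[label="REP"];
quit;
```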

PS. Please address your questions to the list, not to me personally. Thanks.
