turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-01-2015 08:35 PM

Hi Experts:

I observe there are different SAS statement to generate exponential and gamma distribution. This is what I have used.

seed=2345;

theta=0.5; *scale parameter;

V=theta*ranexp(seed);

W=2*rangam(seed,theta)

These are not giving me what I want when I try to estimate the parameters of the generated distribution.

Below is the distribution I am trying to generate:

Vi ∼ Exponential(θ);i = 1,2,...n

Wi ∼ Gamma(2,θ);i = 1,2,...n

To be sure that my generation is okay, I want to estimate the parameters of the generated distribution to see how close it is to the value I use in generating the distribution.

Thanks in advance

Jack

Accepted Solutions

Solution

08-02-2015
07:05 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 07:05 AM

I assume you want the solution in IML? There are many parameterization of distribution functions, but it sounds like you want

1) Expoentiatl with scale parameter sigma=0.5

2) Gamma with shape parameter alpha=2 and shape parameter sigma=0.5;

For both situations you can allocate a vector (or matrix) and fill it up by calling the RANDGEN subroutine. The following generates vectors with 1,000 variates. The program then writes the simulated data to a data set and call PROC UNIVARIATE to verify that the MLE estimates are close to the specified values:

proc iml;

call randseed(1);

sigma = 0.5; /* scale parameter */

E = j(1000, 1);

call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */

G = j(1000, 1);

call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

create test var {E G}; append; close;

quit;

proc univariate data=test;

histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);

histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);

run;

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-01-2015 11:39 PM

Proc Univariate will generate distributions parameters

Base SAS(R) 9.2 Procedures Guide: Statistical Procedures, Third Edition

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 12:16 AM

Thanks Reeza.

I used this univariate procedure earlier, but the parameter estimates that I obtained are far from what I used in simulation. Perhaps, I am not simulating the data correctly.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 12:08 AM

RANDGEN() can't get it ?

call randgen(x, 'EXPO');

If you want estimate distribution's parameter , check the following link by Rick , how to get the best parameter by Maximizing Likelihood Function.

http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml.html

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 12:19 AM

Thanks. What's the difference between randgen and randexp?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 12:27 AM

According to @Rick , It is obsolete function which couldn't generate real random number ,and it is deprecated in Documentation.

However RANDGEN() does , Therefore, always use RANDGEN() to generate any sort of distribution , NOT randXXX() .

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 12:52 AM

Thanks Xia. I will try and see what I get

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 07:08 AM

The numbers from RANDGEN are not "real" either. Both algorithms generate pseudo-random numbers. It is just that the algorithm that RANDGEN uses is more sophisticated. For details, see

"Six reasons you should stop using the RANUNI function to generate random numbers"

Solution

08-02-2015
07:05 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 07:05 AM

I assume you want the solution in IML? There are many parameterization of distribution functions, but it sounds like you want

1) Expoentiatl with scale parameter sigma=0.5

2) Gamma with shape parameter alpha=2 and shape parameter sigma=0.5;

For both situations you can allocate a vector (or matrix) and fill it up by calling the RANDGEN subroutine. The following generates vectors with 1,000 variates. The program then writes the simulated data to a data set and call PROC UNIVARIATE to verify that the MLE estimates are close to the specified values:

proc iml;

call randseed(1);

sigma = 0.5; /* scale parameter */

E = j(1000, 1);

call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */

G = j(1000, 1);

call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

create test var {E G}; append; close;

quit;

proc univariate data=test;

histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);

histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-02-2015 09:05 AM

Thanks!!!! Worked like a charm!!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-10-2015 09:33 PM

Hi Rick,

Please, after reading your post I have began to learn how to IML for simulation

The following program works perfectly well; it gives me what I want. However, in data step, I can easily create sample ID using a do "rep=1 to m" and analyze by that sample ID. By sample ID, I mean replicate. How do I generate the sample ID (replicate) in IML so that I can analyze my data by sample ID?

Thanks in advance:

proc iml;

call randseed(1);

sigma = 0.5; /* scale parameter */

E = j(1000, 1);

call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */

G = j(1000, 1);

call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */

create test var {E G}; append; close;

quit;

proc univariate data=test;

histogram E / exponential(scale=EST) endpoints=(0 to 5 by 0.25);

histogram G / gamma(shape=EST scale=EST) endpoints=(0 to 5 by 0.25);

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-11-2015 03:36 AM

There are probably lots of ways of solving this. Your data step solution could be made to work in IML too, as you could write a loop and then APPEND inside, each time adding records with the loop variable and a single random number. But if you want to retain the efficiency of Rick's code that generates a vector of random numbers all at once, then you will need to create vectors the same length, that contain the required ID information. I like to use the direct product operator '@' for this as follows (assuming you want to generate 10 reps for each of 100 IDs):

```
proc iml;
call randseed(1);
sigma = 0.5; /* scale parameter */
E = j(1000, 1);
call randgen(E, "Exponential", sigma); /* E ~ Expo(0.5) */
G = j(1000, 1);
call randgen(G, "Gamma", 2, sigma); /* G ~ Gamma(2, 0.5) */
ID = t(1:100) @ j(10,1);
REP = j(100,1) @ t(1:10);
create test var {ID REP E G}; append; close;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

09-11-2015 08:45 AM

If you have my book *Simulating Data with SAS*, you can read about this on p. 69 and throughout the book.

Ian is a whiz with the Kronecker direct product (@). I am less proficient, so I tend to use the REPEAT function. You can read about my approach in the article "Create an ID vector for repeated measurements."

PS. Please address your questions to the list, not to me personally. Thanks.