Simulating data from gamma and Pareto distributions

Posted 07-18-2017 10:37 PM

I am working on modeling data.

Using PROC UNIVARIATE to fit the data, I get a Gamma distribution with theta=60500 and a Pareto distribution with theta=300000.

Then I simulate data from these distributions using the following code:

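%let N = 3500;            /* size of each sample */

%let NumSamples = 120;    /* number of samples */

data simu.pareto1(keep=ID X);

call streaminit(1234);    /* set the seed once, before the loops */

a = 0.383045;             /* shape parameter */

k = 222548.1;             /* scale parameter */

do ID = 1 to &NumSamples;

do i = 1 to &N;

U = rand("Uniform");

X = k / U**(1/a);         /* inverse-CDF method for the Pareto distribution */

output;

end;

end;

run;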

But the output seems to deviate quite a lot from the fitted data. I'm not sure whether this is because I did not specify theta in my code.

If so, please let me know the code for simulating with a specified theta (for both Gamma and Pareto).

4 REPLIES

Reply 1

You can simulate a random variate Y with threshold parameter theta (and possibly a scale parameter sigma) by using the RAND function, as you do, to create the standard variate X and then computing Y = theta + sigma * X.

Your way of simulating a standard Pareto looks good if you want to do this in a DATA step, since the RAND function does not support the Pareto distribution. The RANDGEN subroutine in SAS/IML *does*, however, support it, so you can do it directly there. The RAND function also supports the Gamma distribution, so simply simulate a Gamma random variate and apply the transformation above.
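As a short worked derivation of why the posted formula works and what the shift does, assume the classical Pareto parameterization with shape $a$ and scale $k$ that the posted code implies (whether this matches the parameterization that PROC UNIVARIATE fits is worth checking against its documentation):

$$
F(x) = 1 - \left(\frac{k}{x}\right)^{a}, \quad x \ge k,
\qquad
X = \frac{k}{U^{1/a}}, \quad U \sim \mathrm{Uniform}(0,1)
$$

$$
P(X \le x) = P\!\left(U \ge \left(\frac{k}{x}\right)^{a}\right) = 1 - \left(\frac{k}{x}\right)^{a},
\qquad
Y = \theta + X \;\Rightarrow\; Y \ge \theta + k
$$

Under that assumption, adding the threshold moves the smallest possible value from $k$ to $\theta + k$. With the values in the question ($\theta = 300000$, $k = 222548.1$), the shifted variate cannot fall below $522548.1$.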

Reply 2

Yes, your Pareto simulation is identical to the one on p. 113 of *Simulating Data with SAS*.

If you look at pp. 109-111, there is a section on "Adding Location and Scale Parameters."

Just add the Theta value:

X = Theta_Pareto + k / U**(1/a);

Similarly for the gamma simulation, use

G = Theta_gamma + <scale param here> * rand("gamma", <shape param here>);

You can simulate it in the same DATA step that simulates the Pareto variable.
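A minimal sketch of that combined DATA step follows. The thresholds, `a`, and `k` come from the original post; the gamma shape and scale are not given in the thread, so `GammaShape` and `GammaScale` are placeholders, and the macro variable and data set names are illustrative only.

```sas
/* Thresholds taken from the original post */
%let ThetaPareto = 300000;
%let ThetaGamma  = 60500;

/* Placeholders (assumptions): replace with the PROC UNIVARIATE estimates */
%let GammaShape  = 2;
%let GammaScale  = 10000;

%let N          = 3500;   /* size of each sample */
%let NumSamples = 120;    /* number of samples   */

data work.sim(keep=SampleID X G);
   call streaminit(1234);            /* set the seed once */
   a = 0.383045;                     /* Pareto shape from the post */
   k = 222548.1;                     /* Pareto scale from the post */
   do SampleID = 1 to &NumSamples;
      do i = 1 to &N;
         U = rand("Uniform");
         /* shifted Pareto: threshold + standard inverse-CDF draw */
         X = &ThetaPareto + k / U**(1/a);
         /* shifted, scaled gamma: theta + sigma * standard gamma(shape) */
         G = &ThetaGamma + &GammaScale * rand("Gamma", &GammaShape);
         output;
      end;
   end;
run;
```

Running PROC MEANS or PROC UNIVARIATE on `X` and `G` by `SampleID` is one way to compare the simulated samples with the fitted distributions.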

Reply 3

Thank you very much for your kind suggestion above.

However, after following your suggestion, I ran into several issues:

- First, let me clarify that the data range of the fitted Pareto distribution is 300,000-800,000 (with theta = 300,000). I originally simulated the Pareto distribution without specifying theta.
- After I added theta to the code as you suggested, the output statistics seem worse. The statistics being compared are:

Fitting Pareto distribution -> output1

Simulation without theta -> output2 (after cutting off the data so that the range is 300,000-800,000)

Simulation with theta -> output3 (after cutting off the data so that the range is 300,000-800,000)

As you can see, the statistics of the simulation with theta differ considerably from those of the fitted Pareto. I am not sure whether these outputs are acceptable. If not, please share any further suggestions.

Reply 4

1. If you want us to match data, you need to supply some sample data in the form of a SAS DATA step. It is difficult to guess what difficulties you might be having without a common set of data that everyone can run.

2. Numerically speaking, I would suggest measuring units in thousands, so that your data are 300-800 (with theta =300). This is likely to be more robust.
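The rescaling in point 2 might look like the sketch below. The data set `work.claims` and the variable `Amount` are hypothetical names, since the thread does not show the original data; only the threshold of 300 (thousands) comes from the discussion.

```sas
/* Hypothetical input: work.claims with a numeric variable Amount */
data work.claims_k;
   set work.claims;
   AmountK = Amount / 1000;   /* e.g. 300,000 becomes 300 */
run;

proc univariate data=work.claims_k;
   var AmountK;
   /* fit the Pareto with the threshold fixed at 300 (thousands) */
   histogram AmountK / pareto(theta=300);
run;
```

Values simulated from the rescaled fit can be multiplied by 1000 afterwards to return to the original units.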
