Posted 04-03-2024 06:46 PM

Hi,

I am trying to generate a single column of data (length N=250,1000, 2000), from a Beta distribution with specified skewness and kurtosis (I want a range of skewness and kurtosis: skew = 0,1,2 and kurt = 3,5,7). Given values for Skew and Kurt, I am not able to back-solve for the Beta shape parameters A and B. Is there a formula to go from Skew and Kurt to A and B?

How can I get the data I want? I found this post here which is similar to my problem:

Should I just use the RandFleishman code and modify the FLFUNC and FLDERIV functions to be the Beta function and its first derivative? I am not sure how to go about this.

```
/* Newton's method to find roots of a function.
   You must supply the FLFUNC and FLDERIV functions
   that compute the function and the Jacobian matrix.
   Input:  x0 is the starting guess
           optn[1] = max number of iterations
           optn[2] = convergence criterion for || f ||
   Output: x contains the approximation to the root */
start Newton(x, x0, optn);
   maxIter = optn[1]; converge = optn[2];
   x = x0;
   f = FLFunc(x);
   do iter = 1 to maxIter while(max(abs(f)) > converge);
      J = FLDeriv(x);
      delta = -solve(J, f);   /* correction vector */
      x = x + delta;          /* new approximation */
      f = FLFunc(x);
   end;
   /* return missing if no convergence */
   if iter > maxIter then x = j(nrow(x0),ncol(x0),.);
finish Newton;
```

Are there other modifications I need to make to the RandFleishman code?

I am new to simulating data.

@Rick_SAS or others, please help! Thank you.

1 ACCEPTED SOLUTION

I don't think you need a separate post. This thread is fine.

As I have already told you, it is impossible to have a probability distribution for which kurt=3 when skew>1.5. You can only use feasible pairs of (skew,kurt) values. In general, the feasible region is defined by kurt >= 1 + skew**2, where kurt is the full kurtosis (kurt=3 for the normal distribution); pairs below that bound are impossible. However, the Fleishman family cannot model the most extreme distributions. Here is some DATA step code to get only the feasible pairs that can be fit by the Fleishman family:

```
/* create (skew,kurt) values for skew >= 0 that can be fit by Fleishman family.
   Note: kurt in this step is the EXCESS kurtosis (0 for the normal distribution) */
data FeasSkewKurt;
   do skew = 0 to 2.4 by 0.2;
      do kurt = -2 to 10 by 0.5;
         /* keep only valid pairs */
         if kurt > (-1.2264489 + 1.6410373*skew**2) then output;
      end;
   end;
run;
```
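To see how the two bounds apply to the nine pairs in the original question, here is a minimal sketch. It assumes the question's kurt values (3, 5, 7) are full kurtosis, so it subtracts 3 before applying the Fleishman bound, which is stated in terms of excess kurtosis. The data set name `CheckPairs` and the variable names `exKurt`, `anyDist`, and `fleishman` are made up for this illustration.

```sas
/* classify the (skew, kurt) pairs from the question; kurt is FULL kurtosis */
data CheckPairs;
   do skew = 0, 1, 2;
      do kurt = 3, 5, 7;
         exKurt = kurt - 3;                     /* excess kurtosis */
         /* 1 if kurt >= 1 + skew**2 (boundary included), else 0 */
         anyDist = (kurt >= 1 + skew**2);
         /* 1 if the pair is inside the Fleishman region, else 0 */
         fleishman = (exKurt > -1.2264489 + 1.6410373*skew**2);
         output;
      end;
   end;
run;

proc print data=CheckPairs noobs;
run;
```

Under that assumption, the pair (2,3) fails both tests, and none of the skew=2 pairs falls inside the Fleishman region, because the Fleishman bound at skew=2 is an excess kurtosis of about 5.34.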

The main question you need to answer is WHAT DISTRIBUTIONS do you want to simulate from? You originally said beta distributions, which are bounded. You can either choose from standard families (such as beta, gamma, lognormal,...) and try to get a wide range of (skew,kurt) values, or you can use a flexible family of distributions such as the Fleishman family or the Johnson system. Using a family such as truncated normals or a mixture of normals is going to greatly complicate your life, so I do not recommend using those families. (The problem is that it is hard to find parameter values for each (skew,kurt) pair when you use those distributions.)

The basic idea of what you are trying to do is discussed and implemented in *Simulating Data with SAS* (Wicklin, 2013) in Chapter 16 "Moment Matching and the Moment-Ratio Diagram." In that chapter, I used the Fleishman family, but the same ideas apply to the Johnson system. I recommend either of those families. If you do not have access to that book, you can get the Fleishman functions for free from Appendix D, which is available at https://support.sas.com/en/books/authors/rick-wicklin.html. If you are comfortable using SAS/IML to simulate the data, you can then write the samples to a data set and use the simulated data anywhere in SAS.

4 REPLIES

Some of the (skew, kurt) values that you mention are not reachable for a beta(a,b) distribution. Others are limiting cases for the beta but are not properly beta distributions. For example,

- The pair (skew,kurt)=(2,3) is an impossible combination that is not obtainable by ANY probability distribution.
- The pair (0,3) specifies the moments for the normal distribution. The normal distribution is an asymptotic limit of the beta family when a=b and a -> infinity.

Please read about the moment-ratio diagram, which shows the feasible (skew,kurt) values for common families.

After you choose feasible (skew, kurt) values, then find the (a,b) values that correspond to them by solving the nonlinear equations that relate (a,b) to (skew, kurt). You can then simulate from the beta(a,b) distribution in each case.
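For the beta family specifically, the two moment equations can be inverted in closed form, so a nonlinear solver may not be needed. The sketch below is not from the thread. It assumes the standard beta formulas for skewness and excess kurtosis and the resulting method-of-moments identities for nu = a + b, and it assumes the input kurt is full kurtosis. The data set names (`BetaParams`, `BetaSim`), the seed, the sample size, and the three example pairs are all hypothetical. A pair is kept only if it lies in the beta region, skew**2 - 2 < exKurt < 1.5*skew**2.

```sas
/* Sketch: closed-form beta(a,b) from (skew, FULL kurtosis), then simulate */
data BetaParams;
   input skew kurt;
   exKurt = kurt - 3;                            /* excess kurtosis */
   /* keep only pairs inside the beta region of the moment-ratio diagram */
   if (exKurt > skew**2 - 2) and (exKurt < 1.5*skew**2) then do;
      nu = 3*(exKurt - skew**2 + 2) / (1.5*skew**2 - exKurt);  /* nu = a + b */
      if skew = 0 then r = 0;                    /* symmetric case: a = b */
      else r = 1 / sqrt(1 + 16*(nu+1) / ((nu+2)**2 * skew**2));
      a = nu/2 * (1 - sign(skew)*r);             /* skew > 0 implies a < b */
      b = nu/2 * (1 + sign(skew)*r);
      output;
   end;
datalines;
1 3
2 7
0 2.5
;

/* draw N=2000 observations for each (a,b) pair */
data BetaSim;
   if _n_ = 1 then call streaminit(12345);
   set BetaParams;
   do i = 1 to 2000;
      x = rand("Beta", a, b);
      output;
   end;
   keep skew kurt a b x;
run;

/* compare sample moments with the targets */
proc means data=BetaSim n mean skewness kurtosis;
   class skew kurt;
   var x;
run;
```

PROC MEANS reports excess kurtosis, so add 3 before comparing with the kurt column. As a hand check of the formulas, the pair (1,3) gives nu = 2 and (a,b) = (0.5, 1.5). Sample skewness and kurtosis are noisy for strongly skewed cases, so expect visible deviations from the targets at smaller N.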

Thank you, @Rick_SAS !

Yes, I suppose I have to find a nonlinear solver to get me the shape values a and b, given Skew and Kurtosis.

Not sure if I need a different post, but what if I choose to go with a mixture of normals like in the image below, but truncated on the left? I am going for something like below, but on an interval that is bounded on one side only. I am looking to cross different Skew and Kurtosis values: Skew in 0 to 3 and Kurt in (3,5,7).

(From *Allison J. Ames, Brian C. Leventhal & Nnamdi C. Ezike (2020) Monte Carlo Simulation in Item Response Theory Applications Using SAS, Measurement: Interdisciplinary Research and Perspectives, 18:2, 55-74, DOI: 10.1080/15366367.2019.1689762, https://doi.org/10.1080/15366367.2019.1689762*)

I found this from your blog:

https://blogs.sas.com/content/iml/2019/04/29/normal-mixture-distribution-sas.html and "Implement the truncated normal distribution in SAS" (The DO Loop)

Sorry, too many questions. Any advice would help get me started. Thank you again!

Thank you, @Rick_SAS, for your patient answer and advice! Yes, the main problem is making sure the Skew and Kurtosis match up with each other, within a legitimate distribution.

Very kind of you to help!
