mkyron
Calcite | Level 5

I'm trying to find the generalized Pareto parameters for a univariate distribution, with the aim of simulating a dataset from these parameters afterward. PROC UNIVARIATE in the version of SAS I have doesn't allow survey weights, so I have relied on PROC SEVERITY instead.

 

proc severity data=hnw_severity plots=(cdfperdist pp pdfperdist);
   weight weighting;
   loss Overall_Wealth;
   dist Gpd;
run;

 

I then get the following results:

 

Parameter   Estimate   Standard Error   t        Pr > |t|
Theta       683294     6774             100.87   <.0001
Xi          0.36013

 

It has been somewhat difficult to track down the exact formula for simulating data. I've tried PROC IML with the RANDGEN subroutine, but I don't believe it extends to the GPD. Any guidance on using these parameters to simulate a new dataset would be greatly appreciated.

Accepted Solutions
Rick_SAS
SAS Super FREQ

Unfortunately, there are several distributions that all use the name "generalized Pareto." I encourage the OP to make sure that the distribution in PROC SEVERITY is the form that they want. See https://blogs.sas.com/content/iml/2018/11/05/fit-pareto-distribution-sas.html 

 

As Koen says, p. 111 of Simulating Data with SAS shows how to simulate from the generalized Pareto distribution that PROC SEVERITY uses. Here is the code for the OP's parameter estimates:

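%let N = 5000;                /* sample size */
data GPD;
call streaminit(12345);       /* arbitrary seed, only so the sample is reproducible */
theta = 683294;               /* scale estimate from PROC SEVERITY */
xi = 0.36013;                 /* shape estimate from PROC SEVERITY */
do i = 1 to &N;
   /* Generalized Pareto(scale=theta, shape=xi) by inverting the CDF */
   U = rand("Uniform");       /* U ~ U(0,1) */
   X = theta/xi * (U**(-xi)-1);
   output;
end;
drop i U;
run;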
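
For readers wondering where the expression for X comes from, here is a short derivation. It assumes the PROC SEVERITY form of the GPD has the CDF written below, with scale θ > 0 and shape ξ > 0; check this against the documentation for your release before relying on it.

$$
F(x) = 1 - \left(1 + \frac{\xi x}{\theta}\right)^{-1/\xi}, \qquad x \ge 0 .
$$

Setting $F(x) = p$ and solving for $x$ gives the quantile function:

$$
1 - p = \left(1 + \frac{\xi x}{\theta}\right)^{-1/\xi}
\;\Longrightarrow\;
x = \frac{\theta}{\xi}\left[(1 - p)^{-\xi} - 1\right].
$$

If $p$ is uniform on $(0,1)$, then so is $U = 1 - p$, which gives the form used in the DATA step:

$$
X = \frac{\theta}{\xi}\left(U^{-\xi} - 1\right).
$$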

If you fit this random sample by using PROC SEVERITY (no weights), you should get parameter estimates that are close to the specified parameters.
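
A minimal sketch of that check, assuming the GPD data set created by the DATA step above is still in the WORK library. It reuses only the LOSS and DIST statements from the original question and omits the WEIGHT statement because the simulated sample is unweighted.

```sas
/* Refit the simulated sample with the same distribution.        */
/* Assumes WORK.GPD exists and contains the simulated variable X. */
proc severity data=GPD;
   loss X;      /* simulated losses */
   dist Gpd;    /* generalized Pareto, as in the original fit */
run;
```

The Theta and Xi estimates should land near 683294 and 0.36013, with some sampling variation at N = 5000.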

5 REPLIES
sbxkoenk
SAS Super FREQ

Hello,

 

In this paper, the authors simulate data from the generalized Pareto distribution (GPD):

 

MWSUG 2018 - Paper AA-109
Application of heavy-tailed distribution using PROC IML, NLMIXED, and SEVERITY
Palash Sharma, University of Kansas Medical Center, Kansas City, KS
John Keighley, Ph.D. University of Kansas Medical Center, Kansas City, KS
https://www.mwsug.org/proceedings/2018/AA/MWSUG-2018-AA-109.pdf

 

See also:
MWSUG 2019 - Paper IN-116
Simulating Skewed Multivariate Distributions Using SAS®: Cases of Lomax, Mardia’s Pareto (Type I), Logistic, Burr and F Distributions
Zhixin Lun, Oakland University, Rochester, MI
Ravindra Khattree, Oakland University, Rochester, MI
https://www.mwsug.org/proceedings/2019/IN/MWSUG-2019-IN-116.pdf

 

@Rick_SAS: anything to add?
I know that your book "Simulating Data with SAS" also covers the generalized Pareto distribution.

 

Good luck,

Koen

 

sbxkoenk
SAS Super FREQ

@Rick_SAS wrote:

Unfortunately, there are several distributions that all use the name "generalized Pareto." I encourage the OP to make sure that the distribution in PROC SEVERITY is the form that they want. 


Absolutely true! It can be confusing.
For the generalized Pareto, the generalized gamma, and so on, there are 3-parameter versions and 4-parameter versions (maybe 2-parameter versions?). There are also multiple flavours (parameterizations) of the formulae that come down to exactly the same thing, of course.
So, just like Rick, I encourage the OP to make absolutely sure they are fitting and simulating what they want to fit and simulate.
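
As one illustration of two parameterizations that come down to the same thing, here is a sketch comparing the form that PROC UNIVARIATE is understood to use for its Pareto fit with the PROC SEVERITY form. Both CDFs are assumptions to verify against the documentation, and the subscripts U and S are labels added here to keep the two θ symbols apart: in the first, θ is a threshold; in the second, θ is the scale.

$$
F_U(x) = 1 - \left(1 - \frac{\alpha\,(x - \theta_U)}{\sigma}\right)^{1/\alpha},
\qquad
F_S(x) = 1 - \left(1 + \frac{\xi\,x}{\theta_S}\right)^{-1/\xi}.
$$

Under these assumptions the two agree when the threshold is zero and the parameters are mapped as

$$
\theta_U = 0, \qquad \sigma = \theta_S, \qquad \alpha = -\xi .
$$

So a positive shape ξ in one procedure corresponds to a negative α in the other, which is easy to miss when carrying estimates from one tool to a simulation formula written for the other.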

Last but not least:
If NLMIXED, SEVERITY, UNIVARIATE and so on cannot estimate the parameters, I have always managed to do it with PROC OPTMODEL (SAS/OR & SAS Optimization). Take care: you must not make an error in the maximum likelihood equation that you specify as the objective function!
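
A minimal sketch of what such an OPTMODEL fit could look like for the two-parameter GPD. It assumes the density f(x) = (1/θ)(1 + ξx/θ)^(-1-1/ξ), an unweighted data set named GPD with a positive loss variable X, and starting values picked by hand. The names OBS, loss and LogLik are hypothetical, and this is not a tested replacement for PROC SEVERITY.

```sas
proc optmodel;
   /* Observation index set and the losses read from the data set */
   set OBS;
   num loss{OBS};
   read data GPD into OBS=[_N_] loss=X;

   /* Scale and shape, both kept strictly positive; initial values are guesses */
   var theta >= 1e-6 init 500000;
   var xi    >= 1e-6 init 0.5;

   /* Log-likelihood of the assumed GPD density, summed over observations */
   max LogLik = sum {i in OBS}
      ( -log(theta) - (1 + 1/xi) * log(1 + xi*loss[i]/theta) );

   solve with nlp;
   print theta xi;
quit;
```

Because theta is several orders of magnitude larger than xi, the solver may behave better if the losses are rescaled first (for example, to thousands) and theta is converted back afterward.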

 

Cheers,

Koen

mkyron
Calcite | Level 5

Thanks all for the responses. This helped to get the solution I needed and also learn a lot more on the topic...much appreciated!
