I'm trying to estimate the generalized Pareto parameters for a univariate distribution, with the aim of simulating a dataset from these parameters afterward. PROC UNIVARIATE in the version of SAS I have doesn't allow survey weights, so I have relied on PROC SEVERITY instead.
proc severity data=hnw_severity plots=(cdfperdist pp pdfperdist);
   weight weighting;
   loss Overall_Wealth;
   dist Gpd;
run;
I then get the following results:
Parameter   Estimate   Standard Error   t Value   Pr > |t|
Theta       683294     6774             100.87    <.0001
Xi          0.36013
It has been somewhat difficult to track down the exact formula for simulating data. I've tried PROC IML with the RANDGEN function, but I don't believe it supports the GPD. Any guidance on using these parameters to simulate a new dataset would be greatly appreciated.
Hello,
This paper simulates data from the generalized Pareto distribution (GPD):
MWSUG 2018 - Paper AA-109
Application of heavy-tailed distribution using PROC IML, NLMIXED, and SEVERITY
Palash Sharma, University of Kansas Medical Center, Kansas City, KS
John Keighley, Ph.D. University of Kansas Medical Center, Kansas City, KS
https://www.mwsug.org/proceedings/2018/AA/MWSUG-2018-AA-109.pdf
See also:
MWSUG 2019 - Paper IN-116
Simulating Skewed Multivariate Distributions Using SAS®: Cases of Lomax, Mardia’s Pareto (Type I), Logistic, Burr and F Distributions
Zhixin Lun, Oakland University, Rochester, MI
Ravindra Khattree, Oakland University, Rochester, MI
https://www.mwsug.org/proceedings/2019/IN/MWSUG-2019-IN-116.pdf
@Rick_SAS: anything to add?
I know that your book "Simulating Data with SAS" also covers the generalized Pareto distribution.
Good luck,
Koen
Unfortunately, there are several distributions that all use the name "generalized Pareto." I encourage the OP to make sure that the distribution in PROC SEVERITY is the form that they want. See https://blogs.sas.com/content/iml/2018/11/05/fit-pareto-distribution-sas.html
As Koen says, p. 111 of Simulating Data with SAS shows how to simulate from the generalized Pareto distribution that PROC SEVERITY uses. Here is the code for the OP's parameter estimates:
%let N = 5000;
data GPD;
   theta = 683294;
   xi = 0.36013;
   do i = 1 to &N;
      /* Generalized Pareto(scale=theta, shape=xi) */
      U = rand("Uniform");
      X = theta/xi * (U**(-xi)-1);
      output;
   end;
   drop i U;
run;
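The formula in the DATA step is an instance of the inverse-CDF method. As a sketch, assuming the PROC SEVERITY form of the GPD has the CDF shown in the first line below (check the documentation for your release before relying on it), setting F(x) = u and solving for x gives:

```latex
\begin{align*}
F(x) &= 1 - \left(1 + \frac{\xi x}{\theta}\right)^{-1/\xi},
      \qquad x \ge 0,\ \theta > 0,\ \xi > 0 \\
u = F(x)
  \;&\Longrightarrow\;
  x = \frac{\theta}{\xi}\left[(1-u)^{-\xi} - 1\right] \\
U \sim \mathrm{Uniform}(0,1)
  \;&\Longrightarrow\;
  1 - U \sim \mathrm{Uniform}(0,1),
  \quad\text{so}\quad
  X = \frac{\theta}{\xi}\left(U^{-\xi} - 1\right)
\end{align*}
```

The last line is why the code can use U**(-xi) rather than (1-U)**(-xi): both give a draw from the same distribution.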
If you fit this random sample by using PROC SEVERITY (no weights), you should get parameter estimates that are close to the specified parameters.
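A minimal sketch of that check, using the GPD data set and the X variable created by the DATA step above:

```sas
/* Refit the simulated sample; no WEIGHT statement is needed here */
proc severity data=GPD;
   loss X;      /* simulated values from the DATA step */
   dist gpd;    /* same distribution that was fit to the original data */
run;
```

With N = 5000, the estimates of Theta and Xi will not match the specified values exactly. Adding a call to STREAMINIT with a fixed seed before the DO loop makes the sample, and therefore the refit, reproducible.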
@Rick_SAS wrote:
Unfortunately, there are several distributions that all use the name "generalized Pareto." I encourage the OP to make sure that the distribution in PROC SEVERITY is the form that they want.
Absolutely true! It can be confusing.
Generalized Pareto, generalized gamma, and similar families come in 3-parameter and 4-parameter versions (maybe 2-parameter versions too). There are also multiple parameterizations of the same formula that are mathematically equivalent.
So, like Rick, I encourage the OP to make absolutely sure they are fitting and simulating the distribution they intend.
Last but not least:
If NLMIXED, SEVERITY, UNIVARIATE, and so on cannot estimate the parameters, I have always managed to do it with PROC OPTMODEL (SAS/OR and SAS Optimization). Take care not to make an error in the likelihood function that you have to specify as the objective function!
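As a rough, unweighted sketch of the OPTMODEL approach, the block below maximizes the GPD log-likelihood for the two-parameter form discussed in this thread. It assumes the data are in the GPD data set with variable X (from the simulation code in this thread), that xi > 0, and that the starting values are reasonable guesses; the bounds and initial values are illustrative choices, not recommendations.

```sas
proc optmodel;
   /* Read the observations; OBS is indexed by observation number */
   set OBS;
   num x {OBS};
   read data GPD into OBS=[_N_] x=X;

   /* Parameters: scale theta > 0 and shape xi > 0 (assumed) */
   var theta >= 1e-6 init 500000;   /* illustrative starting value */
   var xi    >= 1e-6 init 0.5;      /* illustrative starting value */

   /* Log-likelihood of f(x) = (1/theta) * (1 + xi*x/theta)**(-1 - 1/xi) */
   max LogLik = sum {i in OBS}
      ( -log(theta) - (1 + 1/xi) * log(1 + xi * x[i] / theta) );

   solve with nlp;
   print theta xi;
quit;
```

Because theta and xi differ by several orders of magnitude here, rescaling X (for example, dividing by 100,000) before the fit may help the solver converge.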
Cheers,
Koen
Thanks all for the responses. This helped to get the solution I needed and also learn a lot more on the topic...much appreciated!