Peaw
Fluorite | Level 6

I am working on fitting a distribution to my data, and I am confused about the code.

I found an example that creates a histogram displaying a lognormal fit, and I used the code as follows:

 

title 'Lognormal dist.';
ods select Histogram Lognormal.ParameterEstimates Lognormal.GoodnessOfFit FitQuantiles;
proc univariate data=uy2013;
   var avg_claim;
   histogram / lognormal(w=3 theta=est)
               odstitle = title;
   inset n mean (5.3) std='Std Dev' (5.3) skewness (5.3) /
         pos = ne
         header = 'Summary Statistics';
run;

 

 

I would like to know: does this code fit a two-parameter lognormal distribution?

If it does, what is THETA=EST used for?

My data start at -0.100, but the threshold in the result is -772.2.

Why is the threshold so far below the minimum? (I am using Base SAS 9.4.)

ArtC
Rhodochrosite | Level 12

The THETA=EST option requests that the maximum likelihood estimate of theta be used as the threshold.

Peaw
Fluorite | Level 6

http://support.sas.com/documentation/cdl/en/procstat/70116/HTML/default/viewer.htm#procstat_univaria...

 

Why does this example specify THETA=EST when its result is a two-parameter lognormal distribution, not a three-parameter one?

Rick_SAS
SAS Super FREQ

Your statement is not correct. When you specify THETA=EST, you get a three-parameter fit.

The simple form of your call is

 

proc univariate data=uy2013;
   var avg_claim;
   histogram / lognormal(theta=est);   /* fit three-parameter lognormal distribution */
run;

 

If you want a two-parameter fit, fix the threshold parameter (the lower bound of the distribution) at a specific value, or accept the default, which is THETA=0:

proc univariate data=uy2013;
   var avg_claim;
   histogram / lognormal(theta=1);   /* fixes the threshold (lower bound) at THETA=1 */
run;
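
For reference, the distinction can be read off the density that is commonly written for the three-parameter lognormal family. The symbols follow the names PROC UNIVARIATE reports (threshold θ, scale ζ, shape σ); the form below is the standard textbook density rather than a formula quoted in this thread.

```latex
f(x;\theta,\zeta,\sigma)
  = \frac{1}{(x-\theta)\,\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{\bigl(\ln(x-\theta)-\zeta\bigr)^{2}}{2\sigma^{2}}\right),
  \qquad x > \theta ,
```

with f(x) = 0 for x ≤ θ. Fixing θ at a known value leaves only ζ and σ to estimate, which is the two-parameter fit. When θ is estimated as well, the only constraint is that it lie below the smallest observation, so the estimate can fall far below the data minimum.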

Peaw
Fluorite | Level 6

Thank you so much, it helps a lot. May I ask for some more information?

After running the statement as you suggested, I got the distribution output shown in the attachment.

 

title 'Lognormal dist.';
ods select Histogram Lognormal.ParameterEstimates Lognormal.GoodnessOfFit FitQuantiles;
proc univariate data=ec.uy2013;
   var root_avg_jt;
   where root_avg_jt ge 55 and root_avg_jt le 496;
   histogram / lognormal(w=3 threshold=46.7)
               odstitle = title;
   inset n mean (5.3) std='Std Dev' (5.3) skewness (5.3) /
         pos = ne
         header = 'Summary Statistics';
run;

 

I have two questions:

1. Is this output a two-parameter lognormal fit (not a three-parameter one)?

2. How do we determine the lower bound (theta)? Should I start from the data range, that is, the minimum value?


[Attachment: distribution output.JPG]
Rick_SAS
SAS Super FREQ

Yes, your model is two-parameter when you specify the THRESHOLD= value.

 

The threshold value comes from using domain knowledge of the data. For example, the lognormal and Weibull distributions are often used to model time-to-failure for some component. The time must always be positive, so threshold=0 for that application. Most two-parameter families implicitly assume that the threshold is zero.
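
As a minimal sketch of that advice, the steps below first inspect the data range and then fit with the threshold fixed in advance. The data set, variable, and WHERE clause are taken from the question above; THETA=0 is an illustrative assumption (claim amounts cannot be negative), not a value recommended in this thread.

```sas
/* Inspect the range: a fixed threshold must lie below the smallest value */
proc means data=ec.uy2013 n min max;
   var root_avg_jt;
   where root_avg_jt ge 55 and root_avg_jt le 496;
run;

/* Two-parameter fit: the threshold is fixed at 0 (assumed from domain    */
/* knowledge), so only the scale (zeta) and shape (sigma) are estimated   */
proc univariate data=ec.uy2013;
   var root_avg_jt;
   where root_avg_jt ge 55 and root_avg_jt le 496;
   histogram / lognormal(theta=0);
run;
```

Because THETA=0 is also the default, writing LOGNORMAL without suboptions should give the same fit; spelling it out just makes the assumption visible.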

Peaw
Fluorite | Level 6
Thank you so much for your kindness.
It helps a lot.
Peaw
Fluorite | Level 6

I am following up on my earlier questions about two- versus three-parameter distributions.

As I am still not sure about the output, could you please clarify the following?

The data I am fitting are loss (claim) data with a range of 50 to 524 after a square-root transformation.

Because I did not know how to set the threshold, I ran SAS with THRESHOLD=EST, which gave an estimated threshold of 46.9.

I then specified that threshold value of 46.9 in the HISTOGRAM statement. I believe this gives the two-parameter lognormal distribution (with p-value 0.017), as suggested.

Could you please confirm whether this model is valid?

I am also unsure about the next step, simulation. How do I simulate from a two-parameter distribution with a specified threshold, as in this case?

Rick_SAS
SAS Super FREQ

Technically you have fit a three-parameter distribution, because the threshold value you are using was itself estimated from the data. A proper two-parameter family would use a threshold that is based on domain-specific knowledge of the population distribution, not on a sample.

 

However, I don't understand why you are worrying about this subtle aspect of the problem. If your goal is to simulate from a two-parameter lognormal distribution and thereby generate many samples that look like the observed data, then what you have done is perfectly fine.

 

For simulation, see the article "Simulate lognormal data in SAS."
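
A minimal sketch of one way to do this in a DATA step, assuming the fitted model is X = theta + exp(Z) with Z normally distributed with mean zeta and standard deviation sigma. The threshold 46.9 is the value mentioned above; the zeta and sigma values and the data set name sim_lognormal are hypothetical placeholders, to be replaced with the estimates from the ParameterEstimates table.

```sas
data sim_lognormal;                /* hypothetical output data set name      */
   call streaminit(12345);         /* fix the seed for reproducible draws    */
   theta = 46.9;                   /* threshold reported in this thread      */
   zeta  = 5.0;                    /* HYPOTHETICAL scale; use your estimate  */
   sigma = 0.5;                    /* HYPOTHETICAL shape; use your estimate  */
   do i = 1 to 1000;
      /* shifted lognormal draw: threshold plus exp of a normal variate */
      root_x = theta + exp(rand("Normal", zeta, sigma));
      /* the fit was on square-root data, so square to return to claims */
      x = root_x**2;
      output;
   end;
   drop i theta zeta sigma;
run;
```

Running PROC UNIVARIATE on root_x with the same fixed threshold is a quick way to check that the simulated sample resembles the fitted distribution.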
