Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Analytics
- /
- Forecasting
- /
- Re: glm model for severity

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 11-14-2016 05:40 AM
(2558 views)

i am using motor insurance data , the claim severity is given to me in that data set. based on that i have to model severity. data base is attached with question , i have tried proc severity procedure using gamma distribution:

proc severity data=libish.simulatedwithlog crit=aicc ;

loss severity;

scalemodel logduration / dfmixture =full ;

dist gamma ;

run;

and also applied usual gamma model using genmod procedure .

proc genmod data=libish.dataset2;

class premiumclass age zone;

model severity=premiumclass age /dist=gamma link=log type3;

run;

and the it is giving WARNING: Some observations with invalid response values have been deleted. The response was less than or equal to zero for the Gamma or Inverse Gaussian distributions or less than zero for the Negative Binomial or Poisson distributions.

actually how to conclude gamma model is suitable for this data?

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Rick has addressed GENMOD's warning. I will comment on why PROC SEVERITY doesn't throw a similar warning. PROC SEVERITY supports multiple distributions including your own distributions, so it allows 0 values for the "loss" (response) variable. It takes care of the 0 values in the distribution definition functions. In particular, for the gamma distribution, it uses the following defintion of the PDF function (you can see other functions of PROC SEVERITY's predefined gamma distribution here and all model definitions here😞

function GAMMA_PDF(x, Theta, Alpha); /* Theta : Scale */ /* Alpha : Shape */ minVal = 2.220446E-16; /* alternatives: MACEPS = 2.220446E-16 sqrt(SMALL)= 0.1491668147e-153 */ if (x < minVal) then do; x1 = minVal; /* assume exp(-x1/Theta)~1, because x1/Theta is too small */ p = x1**(Alpha-1) / (gamma(Alpha) * (Theta**Alpha)); end; else p = pdf("GAMMA", x, Alpha, Theta); return(p); endsub;

If you do not want this definition, you can always define your own version of gamma distribution that returns missing PDF and CDF values for 0-valued losses and try fitting it. See PROC SEVERITY documentation to find out how to define and fit your own distributions.

Now, coming back to your question, with your data that contains 0-valued losses, you will probably get some estimates from PROC SEVERITY because its standard gamma definition treats 0 values as very small values (=constant('MACEPS')), but you will need to look at the parameter estimates, fit statistics, and plots to see if it is indeed a good fit. In general, if you have lot of 0-valued response values, you should use a different distribution. The zero-inflated models mentioned by Rick are one option, but I would also suggest looking at the Tweedie distribution.

Hope this helps,

Mahesh

4 REPLIES 4

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

By default, the gamma distribution has a threashold parameter of zero, which means that a random variate from the gamma distribution will always be positive. In your data, you have three observations for which severity=0. The warning is telling you that those observations are dropped from the model, since they can't possibly come from a gamma-distributed variable.

For a similar question and some responses, see the thread "Zero-Inflated Gamma Model".

The options in that thread include slightly modifying the gamma deviance or changing to a different model. Model options include a zero-inflated gamma model or a Tweedie distribution.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks Rick

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Rick has addressed GENMOD's warning. I will comment on why PROC SEVERITY doesn't throw a similar warning. PROC SEVERITY supports multiple distributions including your own distributions, so it allows 0 values for the "loss" (response) variable. It takes care of the 0 values in the distribution definition functions. In particular, for the gamma distribution, it uses the following defintion of the PDF function (you can see other functions of PROC SEVERITY's predefined gamma distribution here and all model definitions here😞

function GAMMA_PDF(x, Theta, Alpha); /* Theta : Scale */ /* Alpha : Shape */ minVal = 2.220446E-16; /* alternatives: MACEPS = 2.220446E-16 sqrt(SMALL)= 0.1491668147e-153 */ if (x < minVal) then do; x1 = minVal; /* assume exp(-x1/Theta)~1, because x1/Theta is too small */ p = x1**(Alpha-1) / (gamma(Alpha) * (Theta**Alpha)); end; else p = pdf("GAMMA", x, Alpha, Theta); return(p); endsub;

If you do not want this definition, you can always define your own version of gamma distribution that returns missing PDF and CDF values for 0-valued losses and try fitting it. See PROC SEVERITY documentation to find out how to define and fit your own distributions.

Now, coming back to your question, with your data that contains 0-valued losses, you will probably get some estimates from PROC SEVERITY because its standard gamma definition treats 0 values as very small values (=constant('MACEPS')), but you will need to look at the parameter estimates, fit statistics, and plots to see if it is indeed a good fit. In general, if you have lot of 0-valued response values, you should use a different distribution. The zero-inflated models mentioned by Rick are one option, but I would also suggest looking at the Tweedie distribution.

Hope this helps,

Mahesh

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

thanks Mahesh

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.