07-02-2015 09:08 AM
I work in insurance as an actuary and use SAS for preparing data. I use another package called Emblem to model claims frequency and severity.
I like the flexibility that SAS offers and would like to start doing the modeling work in SAS (which I haven't done before). Basically, I want to create programs that automate much of the tedious box-ticking work that I currently do manually in Emblem, so that I can spend more time on the areas that require judgement. I know this work can be automated, since I spend a significant amount of time making decisions that could be preprogrammed. Before I explain further, here is a very basic example of the kind of dataset I would feed into Emblem (the actual dataset would have tens of thousands of rows).
| Vehicle Age | Vehicle Value | Main Driver License Type | Exposure (Years) | Claim Numbers | Claim Amounts |
Emblem is graphically pleasant and is good for inexperienced users who want to learn how to model. It basically lists all the rating factors on the left of the screen (vehicle age, etc.) and when you open the software none of the factors are fitted. You then work through each factor and test the significance of the factor by looking at:
how each level of the factor differs
testing whether the difference is consistent over time
then grouping levels that have similar (not significantly different) predicted values
and so on
What I want to do is replicate a GLM I've built in Emblem, as a first step towards understanding how modeling is performed in SAS and to make sure the method I use produces the same results as Emblem. Once I am familiar with that, I can automate aspects of the program where applicable.
The problem I have is that I don't know where to start. Take the above dataset and suppose I wanted to model the effects of the rating factors on claim frequency: how would I do this in SAS? In the simple dataset there are three rating factors (vehicle age, vehicle value and license type). The exposure to risk is shown in the field "Exposure (Years)", and the response I'm trying to predict is claim frequency, which is Claim Numbers divided by Exposure (Years). I want to analyze each factor in the same way I would in Emblem:
Calculate the Chi-squared for the factor
Calculate the significance of the parameter estimates
Test whether the parameter estimates for different levels of a factor differ significantly from one another.
This would be the obvious starting point for me.
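For reference, a claim frequency model of this kind is commonly written as a Poisson GLM with a log link and the log of exposure as an offset. That error structure and link are an assumption here, since the Emblem model could be set up differently. The symbols below ($N_i$ for claim numbers, $e_i$ for exposure in years, $x_i$ for the rating factor indicators of row $i$, $\beta$ for the parameters) are introduced only for illustration:

```latex
N_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \log e_i + x_i^{\top} \beta
\quad\Longrightarrow\quad
\frac{\mu_i}{e_i} = \exp\!\left(x_i^{\top} \beta\right)
```

Under this form the fitted claim frequency is multiplicative in the rating factors, and the offset term $\log e_i$ carries the exposure so that the response modeled is the claim count rather than the ratio itself.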
I should point out that although I have SAS/GRAPH, I'm not interested in producing graphs at this stage. I would prefer to do the analysis statistically first, and possibly produce graphs later for use in log documents or other reports.
Any advice would be great.
07-02-2015 02:59 PM
It looks like you will want to use PROC GENMOD. You should probably read a few documents about the procedure and work through the examples in the documentation before tackling your particular problem. We can help more once you have tried something and posted your code. Here are some good resources:
There are many more. You can try the introductory example in the GENMOD User's Guide.
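As a rough starting point, a claim frequency model along the lines described in the question might look like the sketch below. It is not a definitive setup: the dataset name `work.claims` and the variable names (`vehicle_age`, `vehicle_value`, `license_type`, `exposure`, `claim_numbers`) are hypothetical stand-ins for the columns in the example, and a Poisson error with a log link is assumed, which may not match what was used in Emblem. Listing all three factors in the CLASS statement also assumes vehicle age and vehicle value are already banded into levels.

```sas
/* Assumed input: WORK.CLAIMS, one row per risk cell, with hypothetical
   variable names standing in for the columns in the question. */
data work.claims_model;
   set work.claims;
   where exposure > 0;             /* log() is undefined for zero exposure */
   log_exposure = log(exposure);   /* offset term for the log link */
run;

proc genmod data=work.claims_model;
   /* Treat every rating factor as categorical (assumes banded values) */
   class vehicle_age vehicle_value license_type;

   /* Claim counts as the response, log(exposure) as the offset;
      TYPE3 requests a likelihood ratio chi-square test per factor */
   model claim_numbers = vehicle_age vehicle_value license_type
         / dist=poisson link=log offset=log_exposure type3;

   /* Pairwise comparisons between levels of one factor, as a guide
      to which levels might be grouped */
   lsmeans license_type / diff;
run;
```

The Type 3 table gives a chi-square statistic and p-value for each factor, the parameter estimates table gives a Wald chi-square for each level relative to the reference level, and the LSMEANS differences indicate which levels are not significantly different from one another. Modeling the claim count with an offset, rather than modeling the ratio directly, keeps the Poisson assumption on the counts.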