# Choose the most appropriate model

A company suspects that the number of sick days taken by a particular employee is dependent on the employee’s work load and the number of public holidays within each month. Human Resources have compiled data on the number of sick days taken by this employee per month for the last three years.

| Variable Name | Description |
|---------------|-------------|
| Month | The month in which the sick leave was taken |
| Year | The year in which the sick leave was taken |
| WorkDays | The number of working days in the given month/year |
| PubHols | A dummy variable indicating whether there were any public holidays in the given month/year (0 = No, 1 = Yes) |
| WorkLoad | A normalised variable, centred at zero, measuring the work load of the employee, relative to the average, in the given month/year |
| SickDays | The number of sick days taken by the employee in the given month/year |

The company has asked you to assess whether there is any evidence to support their suspicions that the employee is choosing to take sick leave based on his work load and/or public holidays.

What is the most appropriate model to use in this setting?

I am very confused by this. How do I work out which model to use?

This is a statistical methodology question and not a SAS question.

It's also clearly a homework question.

Here's a table that can help narrow down your selection. A common restriction is what you've learned in class.

http://www.ats.ucla.edu/stat/sas/whatstat/default.htm

Determining the type of analysis is the hardest part of Analytics.

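proc logistic data=mydata.sick descending;
  /* the model statement was truncated in the original post; the response and predictors below are assumed */
  model SickDays = WorkLoad PubHols / selection=none;
run;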

I used this code to try to get an answer. The model fits the data well, but the coefficients are not significant, and now I'm stuck.

If you need to factor in time, I assume some sort of time series analysis would be appropriate?

Poisson regression
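SickDays is a monthly count, which is why a count model is suggested here rather than logistic regression. A minimal sketch of how that could look in SAS, assuming the data set name `mydata.sick` from the code posted above and the variable names from the question; `work.sick` and `logWorkDays` are hypothetical names introduced only for this illustration, and using WorkDays as an exposure offset is an assumption, not something the question requires:

```sas
/* Sketch only: mydata.sick and the variable names come from the question;
   work.sick and logWorkDays are hypothetical names introduced here. */
data work.sick;
  set mydata.sick;
  logWorkDays = log(WorkDays);  /* exposure: working days available in the month */
run;

proc genmod data=work.sick;
  /* Poisson model for the monthly count of sick days with a log link.
     The offset turns the effects into rates per working day;
     type3 requests likelihood-ratio tests for each predictor. */
  model SickDays = WorkLoad PubHols / dist=poisson link=log
                                      offset=logWorkDays type3;
run;
```

If the deviance divided by its degrees of freedom comes out well above 1, the counts are probably overdispersed, and it may be worth refitting with `dist=negbin` and comparing the results.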

