09-12-2014 02:07 PM
I am trying to build a logistic regression model using 7 variables (see below) to predict college enrollment (Enroll vs. Not-Enroll). When I put all seven variables in the model, the Hosmer and Lemeshow Goodness-of-Fit Test is significant, which I think suggests that the model does not fit the data well. I then tried variable selection and allowed two-way interactions to enter the selection. Some of the interaction terms were selected for the final model, but the Hosmer and Lemeshow Goodness-of-Fit Test is still significant.
I also tried variable selection without including interaction terms. Four of the seven variables were selected for the final model (race, SAT score, legacy and year), and the Hosmer and Lemeshow Goodness-of-Fit Test is significant for that model as well. Does anyone have suggestions on what I should do next?
A general question: Is the Hosmer and Lemeshow Goodness-of-Fit Test a good way to evaluate model fit? What do you usually use to evaluate model fit for logistic regression?
Any suggestion is appreciated.
09-12-2014 03:14 PM
Thanks so much for responding. My SAS code is below.
proc logistic data=asq;
  class gender (ref='Male') race (ref='White') Aca_Ins (ref='Social Sciences') legacy (ref='No') first_gen (ref='No') year (ref='2014') / param=ref;
  model enroll (event='Enroll') = gender_r race SAT_sum Aca_Ins legacy first_gen year / lackfit;
run;
For the default model fit statistics, do you mean the following table? What should I compare the values to? Thank you!!
| -2 Log L | 4016.729 | 3755.145 |
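One way to read the two -2 Log L values is to take their difference, which is the likelihood ratio chi-square for the hypothesis that all slope coefficients are zero. The data step below is only a sketch of that arithmetic. It assumes the first value (4016.729) is the intercept-only model and the second (3755.145) is the model with covariates, which is the usual column order in PROC LOGISTIC output, and the variable names neg2ll_null, neg2ll_full and lr_chisq are made up for illustration.

```sas
/* Sketch only: variable names are hypothetical; values are copied from the table above */
data _null_;
  neg2ll_null = 4016.729;  /* assumed: -2 Log L, intercept only */
  neg2ll_full = 3755.145;  /* assumed: -2 Log L, intercept and covariates */
  lr_chisq = neg2ll_null - neg2ll_full;  /* likelihood ratio chi-square */
  put lr_chisq= 10.3;      /* writes the difference to the SAS log */
run;
```

The difference here is 261.584. PROC LOGISTIC reports the same likelihood ratio statistic, together with its degrees of freedom and p-value, in the "Testing Global Null Hypothesis: BETA=0" table, so the data step is only there to show where that number comes from. Note that this compares the fitted model against the intercept-only model; it is not a goodness-of-fit test in the Hosmer and Lemeshow sense.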