Hi all,
I am trying to build a logistic regression model that uses seven variables (see below) to predict college enrollment (Enroll vs. Not-Enroll). When I put all seven variables in the model, the Hosmer and Lemeshow Goodness-of-Fit Test is significant, which I think suggests that the model does not fit the data well. I then tried variable selection and allowed two-way interactions to enter the selection. Some of the interaction terms were selected for the final model, but the Hosmer and Lemeshow Goodness-of-Fit Test is still significant.
I also tried variable selection without interaction terms. Four of the seven variables were selected for the final model (race, SAT score, legacy and year), and the Hosmer and Lemeshow Goodness-of-Fit Test is significant there as well. Does anyone have suggestions on what I should do next?
A general question: Is the Hosmer and Lemeshow Goodness-of-Fit Test a good way to evaluate model fit? What do you usually use to evaluate model fit for logistic regression?
Any suggestions are appreciated.
Yanmin
Hi,
Did you compare the default model fit statistics before considering the Hosmer and Lemeshow test?
Hi there,
Thanks so much for responding. My SAS code is below.
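proc logistic data=asq;
  /* Reference-cell coding with an explicit reference level for each predictor */
  class gender (ref='Male') race (ref='White') Aca_Ins (ref='Social Sciences')
        legacy (ref='No') first_gen (ref='No') year (ref='2014') / param=ref;
  /* LACKFIT requests the Hosmer and Lemeshow goodness-of-fit test.
     Note: CLASS lists gender but MODEL uses gender_r; the names should match
     if they refer to the same variable. */
  model enroll (event='Enroll') = gender_r race SAT_sum Aca_Ins legacy first_gen year / lackfit;
run;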
By the default model fit statistics, do you mean the following table? What should I compare them to? Thank you!!
Criterion | Intercept Only | Intercept and Covariates |
---|---|---|
AIC | 4018.729 | 3781.145 |
SC | 4024.708 | 3858.872 |
-2 Log L | 4016.729 | 3755.145 |
Yes. What about the "Testing Global Null Hypothesis: BETA=0" table?
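As a rough illustration of how the fit statistics relate to the global null test: assuming the two numeric columns posted above are PROC LOGISTIC's usual "Intercept Only" and "Intercept and Covariates" columns, the likelihood ratio chi-square is the difference between the two -2 Log L values, and the parameter counts can be backed out from AIC = -2 Log L + 2k. The data set name lr_check and the variable names below are made up for this sketch.

```sas
/* Sketch: likelihood ratio test recomputed from the posted fit statistics.
   Assumes column 1 = intercept only, column 2 = intercept and covariates.
   lr_check and all variable names are hypothetical. */
data lr_check;
  neg2ll_null = 4016.729;   /* -2 Log L, intercept only */
  neg2ll_full = 3755.145;   /* -2 Log L, intercept and covariates */
  aic_null    = 4018.729;
  aic_full    = 3781.145;

  /* AIC = -2 Log L + 2k, so k = (AIC - (-2 Log L)) / 2 */
  k_null = round((aic_null - neg2ll_null) / 2);
  k_full = round((aic_full - neg2ll_full) / 2);

  lr_chisq = neg2ll_null - neg2ll_full;        /* likelihood ratio chi-square */
  df       = k_full - k_null;                  /* number of slope parameters */
  p_value  = sdf('CHISQUARE', lr_chisq, df);   /* upper-tail probability */
run;

proc print data=lr_check noobs;
  var lr_chisq df p_value;
  format p_value pvalue6.4;
run;
```

With these numbers the chi-square is 261.584 on 12 degrees of freedom, which should agree with the Likelihood Ratio row of the Testing Global Null Hypothesis table. A very small p-value there only says that the predictors jointly improve on an intercept-only model. That is a different question from the calibration that the Hosmer and Lemeshow test checks.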
Thank you!!!
I will read a bit more about logistic regression before continuing the analysis. Will come back to you later.
Thank you so much Reeza! This is very helpful!