dart926
Calcite | Level 5

Hi all,

I am trying to build a logistic regression model that uses seven variables (see below) to predict college enrollment (Enroll vs. Not-Enroll). When I put all seven variables in the model, the Hosmer and Lemeshow Goodness-of-Fit Test is significant, which I think suggests that the model does not fit the data well. I then tried variable selection and allowed two-way interactions to enter the selection. Some of the interaction terms were selected for the final model, but the Hosmer and Lemeshow Goodness-of-Fit Test is still significant.


I also tried variable selection without including interaction terms. Four of the seven variables were selected for the final model (race, SAT score, legacy, and year), and the Hosmer and Lemeshow Goodness-of-Fit p-value is significant as well. Does anyone have suggestions on what I should do next?

A general question: Is the Hosmer and Lemeshow Goodness-of-Fit Test a good way to evaluate model fit?  What do you usually use to evaluate model fit for logistic regression?

Any suggestion is appreciated.

Yanmin

  1. Gender (Female vs. Male)
  2. Race (Non-US vs. Minority vs. White)
  3. SAT scores (numerical variable)
  4. Academic Interest (Sciences vs. Interdisciplinary vs. Humanities vs. Undecided vs. Social Sciences)
  5. Legacy (Yes vs. No)
  6. First generation (Yes vs. No)
  7. Year (2012 vs. 2013 vs. 2014)
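
For concreteness, the selection run with two-way interactions described above might look roughly like the following. This is only a sketch: the post does not say which selection method was used, so stepwise is an assumption, as are the entry and stay thresholds. The data set and variable names follow the code posted later in this thread, with gender assumed to be the CLASS variable.

```sas
/* Sketch only: stepwise selection and the 0.05 thresholds are assumptions;
   the thread does not state which selection method was actually used. */
proc logistic data=asq;
    class gender (ref='Male') race (ref='White') Aca_Ins (ref='Social Sciences')
          legacy (ref='No') first_gen (ref='No') year (ref='2014') / param=ref;
    /* "|" with @2 expands to all main effects plus all two-way interactions */
    model enroll (event='Enroll') =
          gender | race | SAT_sum | Aca_Ins | legacy | first_gen | year @2
          / selection=stepwise slentry=0.05 slstay=0.05 lackfit;
run;
```

Replacing the bar expression with a plain list of the seven variables would give the main-effects-only selection.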
6 REPLIES
stat_sas
Ammonite | Level 13

Hi,

Did you compare default model fit statistics before considering Hosmer and Lemeshow?


dart926
Calcite | Level 5

Hi there,

Thanks so much for responding. My SAS code is below.

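proc logistic data=asq;
  /* reference-cell coding with an explicit reference level for each categorical predictor */
  class gender (ref='Male') race (ref='White') Aca_Ins (ref='Social Sciences')
        legacy (ref='No') first_gen (ref='No') year (ref='2014') / param=ref;
  /* lackfit requests the Hosmer and Lemeshow goodness-of-fit test */
  model enroll (event='Enroll') = gender_r race SAT_sum Aca_Ins legacy first_gen year / lackfit;
run;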

For the default model fit statistics, do you mean the following table? What should I compare them to? Thank you!!

Criterion    Intercept Only    Intercept and Covariates
AIC          4018.729          3781.145
SC           4024.708          3858.872
-2 Log L     4016.729          3755.145
stat_sas
Ammonite | Level 13

Yes, what about Testing Global Null Hypothesis: BETA=0?
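
As a rough illustration, the likelihood ratio statistic reported in that table can be reproduced by hand from the -2 Log L values posted above. The sketch below assumes AIC = -2 Log L + 2k (k = number of estimated parameters), which is how the parameter count and degrees of freedom are recovered; the data set and variable names are made up for the example.

```sas
/* Sketch: rebuild the global likelihood ratio test from the posted
   Model Fit Statistics. Data set and variable names are hypothetical. */
data lr_test;
    neg2logl_null = 4016.729;   /* -2 Log L, intercept only */
    neg2logl_full = 3755.145;   /* -2 Log L, intercept and covariates */
    aic_full      = 3781.145;   /* AIC, intercept and covariates */

    /* AIC = -2 Log L + 2k, so k = 13 parameters including the intercept */
    k_full = round((aic_full - neg2logl_full) / 2);
    df     = k_full - 1;        /* 12 slope parameters are tested */

    /* drop in -2 Log L when the covariates are added: 261.584 */
    lr_chisq = neg2logl_null - neg2logl_full;

    /* upper-tail chi-square probability */
    p_value = sdf('CHISQ', lr_chisq, df);
run;

proc print data=lr_test noobs;
    var lr_chisq df p_value;
run;
```

A chi-square of about 261.6 on 12 degrees of freedom corresponds to p < .0001, so the covariates as a group improve on the intercept-only model. That is a different question from the one the Hosmer and Lemeshow test asks.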

dart926
Calcite | Level 5

Thank you!!!

I will read a bit more about logistic regression before continuing the analysis. I will come back to you later.

dart926
Calcite | Level 5

Thank you so much Reeza! This is very helpful!
