Hi, can anyone help clarify a doubt about goodness of fit in binary logistic regression? I built a model to predict the event response and got an excellent c statistic of about 0.8 and an attractive ROC curve. I checked the model by scoring the validation dataset and was able to capture more than 70% of my responses in the top 4 deciles. The KS statistic also looks significant. The only trouble is that the HL (Hosmer-Lemeshow) statistic comes out significant, which suggests I need to rethink the model. My question: can I ignore the HL test and rely more on predictive power? I am using PROC LOGISTIC here. Appreciate any help on this.
The Hosmer-Lemeshow test is not my favorite test; it has low power in smaller samples and can show significance for unimportant deviations in very large ones. I much prefer to look at the observed-predicted plots themselves.
If the HL test is significant, it doesn't say that the model you have is "wrong," it says that it can be "improved." Sometimes "improvement" means a different model, additional variables, or data transformations. However, if the model seems adequate, I may just "declare victory" and move on.
One caution (that doesn't seem to matter here): if the HL test is not significant, it would be inappropriate to claim that nothing is going on (i.e., to "accept the null").
Doc Muhlbaier
Duke
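For anyone who wants to see what sits behind an observed-predicted plot, here is a minimal sketch in Python (not SAS, and not the code used in this thread). It sorts cases by predicted probability, splits them into ten equal-sized groups, and compares observed with expected event counts per group; the same table feeds the usual HL chi-square. The function names `calibration_table` and `hosmer_lemeshow` and the simulated data are illustrative assumptions only.

```python
import random

def calibration_table(y_true, p_hat, n_groups=10):
    """Group cases by predicted probability; compare observed vs expected events."""
    pairs = sorted(zip(p_hat, y_true))  # order cases by predicted probability
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        lo, hi = g * n // n_groups, (g + 1) * n // n_groups
        chunk = pairs[lo:hi]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)  # sum of predicted probabilities
        observed = sum(y for _, y in chunk)  # count of actual events
        rows.append({
            "group": g + 1,
            "n": len(chunk),
            "observed": observed,
            "expected": expected,
            "obs_rate": observed / len(chunk),
            "mean_pred": expected / len(chunk),
        })
    return rows

def hosmer_lemeshow(rows):
    """HL chi-square: sum of (O - E)^2 / (E * (1 - E/n)); df = groups - 2.

    Assumes every group has 0 < E < n, i.e. predictions strictly inside (0, 1).
    """
    stat = 0.0
    for r in rows:
        o, e, n = r["observed"], r["expected"], r["n"]
        stat += (o - e) ** 2 / (e * (1 - e / n))
    return stat, len(rows) - 2

# Simulated, perfectly calibrated data (an assumption for illustration only):
# each outcome is drawn with exactly the probability that is "predicted".
random.seed(0)
p_hat = [random.uniform(0.05, 0.95) for _ in range(2000)]
y = [1 if random.random() < p else 0 for p in p_hat]

rows = calibration_table(y, p_hat)
stat, df = hosmer_lemeshow(rows)

for r in rows:
    print(f"{r['group']:>2}  n={r['n']:>4}  "
          f"mean predicted={r['mean_pred']:.3f}  observed rate={r['obs_rate']:.3f}")
print(f"HL chi-square = {stat:.2f} on {df} df")
```

Plotting `obs_rate` against `mean_pred` for each group gives the observed-predicted plot; points near the diagonal suggest good calibration, and the size of any departure is easier to judge there than from a p-value alone.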