How can I get the Hosmer & Lemeshow goodness-of-fit test to work on scored data? (That is, I want to score new data using a previously fitted model and get this test.)
I'm using PROC LOGISTIC to predict the probability that y = 1 (as an example).
When I develop the model using 5 years of data, I test the model against the developmental data and ask for an ROC curve plot, the area under it, the % of correct predictions (from a classification table), and the Hosmer & Lemeshow goodness-of-fit test.
I also want to score a 'validation' data set (this year's data). To do this (by guessing) I add a line of code:
SCORE DATA = PROJECTS.DATA_08 OUT= SCORE2 OUTROC = ROC_DATA FITSTAT;
This appears to give me the ROC plot and fit statistics for the 2008 data, but I can't figure out how to get the Hosmer & Lemeshow test on the 'scored' developmental data. (I can't find a way to add a LACKFIT option anywhere, as I did for the developmental data set in the MODEL statement.)
SCORE OUT = SCORE1 FITSTAT;
SCORE DATA = PROJECTS.ENRL_08 OUT= SCORE2 OUTROC = ROC_DATA FITSTAT;
/*THE ABOVE LINE APPEARS TO SCORE DESIGNATED DATA SET
GIVEN THE MODEL JUST DEVELOPED & GIVES A ROC PLOT, FITSTATS, NO HL*/
OUTPUT OUT=PRED RESDEV =RESDEV RESCHI =RESCHI H = HAT P = PHAT
LOWER =LCL UPPER = UCL PRED = PRED PREDPROB=(INDIVIDUAL CROSSVALIDATE);
Because the Hosmer & Lemeshow (HL) test is a summary statistic, it can't be part of the score data set. You have to read the score data set and compute the HL statistic manually (the formula is in the documentation).
First run PROC RANK on the predicted probabilities to get the desired number of groups. Then, within each group, sum the predicted probabilities (that is the expected number of events) and count the events (the observed number), and use the formula to compute the chi-square.
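The steps above can be sketched outside SAS to make the arithmetic concrete. This is a minimal illustration in Python rather than SAS, assuming a 0/1 outcome list `y` and a list of predicted probabilities `p`; the function name `hosmer_lemeshow` is hypothetical, and the equal-sized slicing only approximates what PROC RANK with GROUPS= does (ties may be handled differently). Each group contributes (observed - expected)^2 / (expected * (1 - expected / n)).

```python
def hosmer_lemeshow(y, p, groups=10):
    """Return (chi_square, table); table rows are (n, observed, expected).

    Hypothetical helper: y is a list of 0/1 outcomes, p the matching
    predicted probabilities from the scored data set.
    """
    pairs = sorted(zip(p, y))          # order cases by predicted probability
    n_total = len(pairs)
    chi_square = 0.0
    table = []
    for g in range(groups):
        # Slice into near-equal-sized groups (an approximation of PROC RANK)
        chunk = pairs[g * n_total // groups:(g + 1) * n_total // groups]
        if not chunk:
            continue
        n = len(chunk)
        expected = sum(prob for prob, _ in chunk)     # expected events
        observed = sum(event for _, event in chunk)   # observed events
        variance = expected * (1 - expected / n)      # n * pbar * (1 - pbar)
        if variance > 0:
            chi_square += (observed - expected) ** 2 / variance
        table.append((n, observed, expected))
    return chi_square, table


chi_square, table = hosmer_lemeshow([0, 0, 1, 1], [0.25, 0.25, 0.75, 0.75], groups=2)
```

The statistic is then referred to a chi-square distribution. LACKFIT uses g - 2 degrees of freedom for g groups on the data used to fit the model; for a separate validation sample, g degrees of freedom is often suggested instead, so check which applies before quoting a p-value.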
The HL test doesn't have very good power, so failing to reject does not establish acceptable fit. A more sensitive, but subjective, check is to plot the mean predicted probability of the event for each group on the x-axis against the observed proportion of events on the y-axis. With a good fit the points fall on a straight line along the diagonal (observed = predicted). Unfortunately, you can't summarize the plot in a single number the way the c-statistic summarizes the ROC curve as the area under it; you just have to look at a lot of data.
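The points for such a plot can be tabulated with a few lines of code. The sketch below is in Python rather than SAS and assumes a 0/1 outcome list `y` and predicted probabilities `p`; the name `calibration_table` is hypothetical, and the equal-sized grouping is only an approximation of PROC RANK output.

```python
def calibration_table(y, p, groups=10):
    """Return one (mean_predicted, observed_proportion) pair per group.

    Hypothetical helper: plot the first value on the x-axis and the
    second on the y-axis; points near the diagonal suggest good fit.
    """
    pairs = sorted(zip(p, y))          # order cases by predicted probability
    n_total = len(pairs)
    rows = []
    for g in range(groups):
        chunk = pairs[g * n_total // groups:(g + 1) * n_total // groups]
        if not chunk:
            continue
        mean_predicted = sum(prob for prob, _ in chunk) / len(chunk)
        observed_rate = sum(event for _, event in chunk) / len(chunk)
        rows.append((mean_predicted, observed_rate))
    return rows


rows = calibration_table([0, 0, 1, 1], [0.25, 0.25, 0.75, 0.75], groups=2)
```

Groups whose observed proportion sits well above or below the mean predicted probability show where the model under- or over-predicts.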
Thanks! This is the answer I needed. I have a follow-up question:
The purpose of my model is to assign probabilities to 'patrons' that use our service. I want to concentrate our efforts on those that the model predicts to be 'in the middle of the road.' They in theory would be most sensitive to promotions and interventions.
I'm getting a significant HL chi-square at the developmental stage (implying poor fit), but I have c = 74% and correct predictions at 64% (with similar results from the scored data for 2008). Of course, the max-rescaled R-square for my model is only .20. Almost all individual predictors are significant at .001 or .05, with the expected signs.
Can you, or someone else, give me an opinion on how much importance I should attach to the HL test with regard to the utility of my model? Most of the papers that I have reviewed don't even report HL results. In grad school, the log-likelihood-based diagnostics and % of correct predictions were the main emphasis in my courses.
Greene confuses me even more with his remarks regarding maximum likelihood estimators:
‘It remains an interesting question for research whether fitting y well or obtaining good parameter estimates is a preferable estimation criterion. Evidently, they need not be the same thing.’ (Greene, Econometric Analysis, 5th ed., p. 686)