
Hosmer and Lemeshow Goodness-of-Fit on Validation Sample, Help...Many thanks

Frequent Contributor
Posts: 95

Hi,

I would like to perform the Hosmer and Lemeshow test on my validation set. I have managed to run it on the training set that I used to build the model, but what I really want is to run it on the validation (out-of-sample) data to see whether my model generalizes well. Below are my process and the output for the training data. I would like to know how to apply the test to the validation set once I have scored it with the model built on the training data. Your help will be much appreciated. Thank you.

proc logistic data=Xsell_File DESCENDING plots(only)=roc;
   class VAR1 VAR2 VAR3 VAR4 VAR5;
   model Click_Flag = VAR1 VAR2 VAR3 VAR4 VAR5
         / selection=stepwise lackfit;
   SCORE DATA=Validation OUT=Validation_Scores (RENAME=(P_1=p));
run;

Partition for the Hosmer and Lemeshow Test

                    Click_Flag = 1          Click_Flag = 0
Group    Total    Observed   Expected    Observed   Expected
   10    9,236         588     600.52       8,648    8635.48
    9    9,664         494     496.83       9,170    9167.17
    8    9,668         472     442.13       9,196    9225.87
    7    9,656         402     403.19       9,254    9252.81
    6    9,716         392     374.06       9,324    9341.94
    5    9,711         327     346.88       9,384    9364.12
    4    9,612         303     319.56       9,309    9292.44
    3    9,665         316     297.07       9,349    9367.93
    2    9,665         267     271.16       9,398    9393.84
    1   10,047         232     241.61       9,815    9805.39

Hosmer and Lemeshow Goodness-of-Fit Test

Chi-Square    DF    Pr > ChiSq
      7.08     8        0.5281
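
For reference, the statistic in the table above can be reproduced from the partition by hand. The sketch below uses the standard Hosmer and Lemeshow form; the symbols O and E (observed and expected counts per group and outcome) are notation introduced here, not part of the SAS output.

```latex
% Hosmer-Lemeshow statistic over the 10 groups of the partition table.
% O_{1g}, E_{1g}: observed / expected counts with Click_Flag = 1 in group g
% O_{0g}, E_{0g}: observed / expected counts with Click_Flag = 0 in group g
\[
\hat{C} \;=\; \sum_{g=1}^{10}
  \left[ \frac{(O_{1g}-E_{1g})^{2}}{E_{1g}}
       + \frac{(O_{0g}-E_{0g})^{2}}{E_{0g}} \right]
\]
% Example: group 8 is the largest single contribution
\[
\frac{(472-442.13)^{2}}{442.13} + \frac{(9196-9225.87)^{2}}{9225.87}
  \;\approx\; 2.02 + 0.10 \;\approx\; 2.11
\]
% Summing all ten groups gives approximately 7.08, compared against a
% chi-square distribution with 10 - 2 = 8 degrees of freedom.
\[
\hat{C} \approx 7.08, \qquad \mathrm{DF} = 10 - 2 = 8
\]
```

The same arithmetic applies to a validation sample once each observation has a predicted probability, although the appropriate degrees of freedom for out-of-sample data may differ from the in-sample value.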

Accepted Solutions
Solution
‎12-12-2013 10:10 PM
Respected Advisor
Posts: 4,644

Re: Hosmer and Lemeshow Goodness-of-Fit on Validation Sample, Help...Many thanks

I think you might be able to pull this off by writing your final model's parameter estimates to a data set with the OUTEST= option, then calling PROC LOGISTIC again on your validation data set (DATA=), reading in the fitted model with INEST=, preventing a new fit with MAXITER=0, and requesting the H-L test with LACKFIT.

Good luck

PG
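
A minimal sketch of the steps described above, reusing the data set and variable names from the question. The data set name Train_Est is made up for illustration. The second MODEL statement assumes that stepwise selection kept all five effects; in practice it should list only the effects in the final training model, so that the parameters line up with those saved by OUTEST=. It also assumes the validation data contain the same CLASS levels as the training data. This has not been run against the poster's data.

```sas
/* Step 1: fit on the training data and save the final parameter      */
/* estimates. Train_Est is a hypothetical data set name.              */
proc logistic data=Xsell_File descending outest=Train_Est;
   class VAR1 VAR2 VAR3 VAR4 VAR5;
   model Click_Flag = VAR1 VAR2 VAR3 VAR4 VAR5
         / selection=stepwise lackfit;
run;

/* Step 2: run PROC LOGISTIC on the validation data without refitting. */
/* INEST= supplies the training estimates as starting values and       */
/* MAXITER=0 stops before any iteration, so the coefficients stay as   */
/* they were on the training data. LACKFIT then requests the           */
/* Hosmer and Lemeshow partition and test for the validation sample.   */
/* Assumption: all five effects were retained by stepwise selection;   */
/* otherwise list only the retained effects here.                      */
proc logistic data=Validation descending inest=Train_Est;
   class VAR1 VAR2 VAR3 VAR4 VAR5;
   model Click_Flag = VAR1 VAR2 VAR3 VAR4 VAR5
         / maxiter=0 lackfit;
run;
```

With MAXITER=0 the log will probably carry a convergence message, since no iterations are run; the parameter estimates printed in step 2 should match the training fit, which is a quick way to confirm the estimates were picked up. Because the coefficients were not estimated on the validation data, the usual degrees of freedom (number of groups minus 2) are debatable. Recent SAS/STAT releases document a DFREDUCE= suboption of LACKFIT for this, which is worth checking against your version.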




All Replies
SAS Employee
Posts: 68

Re: Hosmer and Lemeshow Goodness-of-Fit on Validation Sample, Help...Many thanks

Hi, I just posted a similar response to Kanyange. You would be best served by posting this question in the SAS STAT Community.

I would consider this more of a stat question than a data mining question.

Thanks,

Jonathan

Frequent Contributor
Posts: 95

Re: Hosmer and Lemeshow Goodness-of-Fit on Validation Sample, Help...Many thanks

Hi Jonathan,

I am quite confused. I thought that modelling was part of data mining? Also, this test helps to validate the model, to see whether your actual and predicted values are similar. As far as I am aware, this is data mining.

Thanks

Super User
Posts: 17,784

Re: Hosmer and Lemeshow Goodness-of-Fit on Validation Sample, Help...Many thanks

I think DataMining here refers to the Enterprise Miner software. Logistic regression is part of data mining, though, and is included in the e-Miner software suite.

Frequent Contributor
Posts: 95

Re: Hosmer and Lemeshow Goodness-of-Fit on Validation Sample, Help...Many thanks

Hi Reeza,

Thanks for your response. I am having trouble getting an answer on this; could you please help? Does EM have this test? Many thanks.

Frequent Contributor
Posts: 95

Re: Hosmer and Lemeshow Goodness-of-Fit on Validation Sample, Help...Many thanks

Thank you very much PG, that's really helpful.
