Kanyange
Fluorite | Level 6


Hi,

I would like to perform the Hosmer and Lemeshow test on my validation set. I have managed to run it on the training set that I used to build the model, but I really want to run it on the validation (out-of-sample) data to see whether my model generalizes well. Below is my process for the training data, along with the output. How can I apply the test to the validation set once I have scored it with the model built on the training data? Your help will be much appreciated. Thank you.

proc logistic data=Xsell_File DESCENDING plots(only)=roc;
   class VAR1 VAR2 VAR3 VAR4 VAR5;
   model Click_Flag = VAR1 VAR2 VAR3 VAR4 VAR5
         / selection=stepwise lackfit;
   SCORE DATA=Validation OUT=Validation_Scores (RENAME=(P_1=p));
run;

Partition for the Hosmer and Lemeshow Test

                   Click_Flag = 1          Click_Flag = 0
Group    Total   Observed   Expected    Observed   Expected
   10    9,236        588     600.52       8,648    8635.48
    9    9,664        494     496.83       9,170    9167.17
    8    9,668        472     442.13       9,196    9225.87
    7    9,656        402     403.19       9,254    9252.81
    6    9,716        392     374.06       9,324    9341.94
    5    9,711        327     346.88       9,384    9364.12
    4    9,612        303     319.56       9,309    9292.44
    3    9,665        316     297.07       9,349    9367.93
    2    9,665        267     271.16       9,398    9393.84
    1   10,047        232     241.61       9,815    9805.39

Hosmer and Lemeshow Goodness-of-Fit Test

Chi-Square   DF   Pr > ChiSq
      7.08    8       0.5281
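
For readers checking the output above by hand: the reported chi-square is the usual Hosmer and Lemeshow statistic, summed over the G = 10 groups of the partition table. The symbols below are notation introduced here for illustration (O and E are the observed and expected counts, with subscript 1 for Click_Flag = 1 and 0 for Click_Flag = 0); they do not appear in the SAS output.

```latex
\hat{C} \;=\; \sum_{g=1}^{G}
  \left[ \frac{(O_{1g}-E_{1g})^{2}}{E_{1g}}
       + \frac{(O_{0g}-E_{0g})^{2}}{E_{0g}} \right],
\qquad \text{DF} = G - 2 = 8

% Example: contribution of group 8 from the partition table
\frac{(472-442.13)^{2}}{442.13} + \frac{(9196-9225.87)^{2}}{9225.87}
  \;\approx\; 2.02 + 0.10 \;=\; 2.12
```

Adding the contributions of all ten groups gives approximately 7.08, which matches the reported statistic on 8 degrees of freedom.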
Accepted Solutions
PGStats
Opal | Level 21

I think you might be able to pull this off in two steps. First, write your final model parameter estimates to a dataset with the OUTEST= option. Then call PROC LOGISTIC again with your validation data set in DATA=, bring in your previous model with INEST=, prevent a new fit with MAXITER=0, and request the H-L test with LACKFIT.
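
A minimal sketch of that two-step approach, reusing the data set and variable names from the question. The data set name Train_Est and the choice of VAR1 to VAR3 as the effects kept by stepwise selection are hypothetical; substitute the effects that your own final model actually retains. The sketch also assumes the CLASS variables have the same levels and coding in both data sets, so that the parameter names in the OUTEST= data set line up with the second call.

```sas
/* Step 1: fit on the training data and save the final estimates.   */
/* Train_Est is a hypothetical name for the OUTEST= data set.       */
proc logistic data=Xsell_File descending outest=Train_Est;
   class VAR1 VAR2 VAR3 VAR4 VAR5;
   model Click_Flag = VAR1 VAR2 VAR3 VAR4 VAR5
         / selection=stepwise lackfit;
run;

/* Step 2: read the saved estimates and apply them to the           */
/* validation data. MAXITER=0 stops the procedure from refitting,   */
/* so LACKFIT is computed with the training-set coefficients.       */
/* Assumption: stepwise kept VAR1-VAR3; list your selected effects. */
proc logistic data=Validation descending inest=Train_Est;
   class VAR1 VAR2 VAR3;
   model Click_Flag = VAR1 VAR2 VAR3
         / maxiter=0 lackfit;
run;
```

With MAXITER=0, expect a note in the log about convergence not being attained; that is the intended behaviour here, since no fitting should take place. It is worth comparing the parameter estimates printed by the second call against the first to confirm they were carried over unchanged.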

Good luck

PG


6 REPLIES
jwexler
SAS Employee

Hi, I just posted a similar response to Kanyange. You would be best served by posting this question in the SAS STAT Community.

I would consider this more of a stat question than a data mining question :)

Thanks,

Jonathan

Kanyange
Fluorite | Level 6

Hi Jonathan,

I am quite confused. I thought that modelling was part of data mining? Also, this test helps to validate the model, to see whether your actual and predicted values are similar. As far as I am aware, this is data mining.

Thanks

Reeza
Super User

I think DataMining here refers to the Enterprise Miner software. Logistic regression is a part of data mining, though, and is part of the e-Miner software suite :)

Kanyange
Fluorite | Level 6

Hi Reeza,

Thanks for your response. I am having trouble getting an answer on this; could you please help? Does EM have this test? Many thanks

Kanyange
Fluorite | Level 6

Thank you very much PG, that's really helpful :)
