Kanyange
Fluorite | Level 6


Hi,

I would like to perform the Hosmer and Lemeshow test on my validation set. I have managed to run it on the training set that I used to build the model, but what I really want is to run it on the validation (out-of-sample) data to see whether my model generalizes well. Below are my process for the training set and the output results. I would like to know how to apply the test to the validation set once I have scored it using the model built on the training data. Your help will be much appreciated. Thank you.

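/* Fit on the training data; DESCENDING models the probability of Click_Flag = 1 */
proc logistic data=Xsell_File DESCENDING plots(only)=roc;
   class VAR1 VAR2 VAR3 VAR4 VAR5;
   /* LACKFIT requests the Hosmer and Lemeshow test on the training data */
   model Click_Flag = VAR1 VAR2 VAR3 VAR4 VAR5
         / selection=stepwise lackfit;
   /* Score the validation data; P_1 (predicted probability of level 1) is renamed to p */
   SCORE DATA=Validation OUT=Validation_Scores (RENAME=(P_1=p));
run;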

Partition for the Hosmer and Lemeshow Test

                   Click_Flag = 1          Click_Flag = 0
Group    Total     Observed   Expected     Observed   Expected
10        9,236    588        600.52       8,648      8635.48
9         9,664    494        496.83       9,170      9167.17
8         9,668    472        442.13       9,196      9225.87
7         9,656    402        403.19       9,254      9252.81
6         9,716    392        374.06       9,324      9341.94
5         9,711    327        346.88       9,384      9364.12
4         9,612    303        319.56       9,309      9292.44
3         9,665    316        297.07       9,349      9367.93
2         9,665    267        271.16       9,398      9393.84
1        10,047    232        241.61       9,815      9805.39

Hosmer and Lemeshow Goodness-of-Fit Test

Chi-Square    DF    Pr > ChiSq
7.08          8     0.5281
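
For readers checking the output above, the reported statistic can be reproduced from the partition table. This is a sketch of the usual Hosmer and Lemeshow calculation, not something stated in the thread; the symbols O and E (observed and expected counts per group and outcome) are notation introduced here for illustration.

```latex
% Hosmer-Lemeshow statistic over the 10 groups; subscript 1 is Click_Flag = 1, 0 is Click_Flag = 0
\[
\chi^2_{HL} \;=\; \sum_{g=1}^{10}
  \left[ \frac{(O_{1g}-E_{1g})^2}{E_{1g}} + \frac{(O_{0g}-E_{0g})^2}{E_{0g}} \right]
\]

% Contribution of group 10, using the first row of the table
\[
\frac{(588-600.52)^2}{600.52} + \frac{(8648-8635.48)^2}{8635.48}
  \;\approx\; 0.261 + 0.018 \;=\; 0.279
\]

% Summing all ten groups gives the reported value, with groups minus 2 degrees of freedom
\[
\chi^2_{HL} \approx 7.08, \qquad \mathrm{DF} = 10 - 2 = 8
\]
```

With a p-value of 0.5281, the test gives no evidence of lack of fit on the training data, which is why the remaining question is how the same comparison behaves on the validation sample.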
Accepted Solutions
PGStats
Opal | Level 21

I think you might be able to pull this off by writing your final model parameter estimates to a data set with the OUTEST= option, and then calling PROC LOGISTIC again on your validation data set (DATA=), bringing in your previous model with INEST=, preventing a new fit with MAXITER=0, and requesting the H-L test with LACKFIT.

Good luck

PG
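
The suggestion above could be sketched roughly as follows. This is an untested illustration, not PG's own code: the data set name Train_Est is hypothetical, and the second MODEL statement assumes that stepwise selection kept all five variables. If some were dropped, the second call would need to list only the effects that remain in the final model so that the parameters in the INEST= data set line up.

```sas
/* Step 1: fit on the training data and save the final parameter estimates.  */
/* Train_Est is a hypothetical data set name chosen for this sketch.         */
proc logistic data=Xsell_File descending outest=Train_Est;
   class VAR1 VAR2 VAR3 VAR4 VAR5;
   model Click_Flag = VAR1 VAR2 VAR3 VAR4 VAR5
         / selection=stepwise lackfit;
run;

/* Step 2: run on the validation data with the saved estimates as starting   */
/* values. MAXITER=0 stops before any iteration, so the training estimates   */
/* are used as-is and LACKFIT is computed on the validation observations.    */
/* Assumption: every effect listed here was retained by the stepwise step.   */
proc logistic data=Validation descending inest=Train_Est;
   class VAR1 VAR2 VAR3 VAR4 VAR5;
   model Click_Flag = VAR1 VAR2 VAR3 VAR4 VAR5
         / maxiter=0 lackfit;
run;
```

Two things are worth checking before relying on the result. The CLASS variables should have the same levels in both data sets, otherwise the design columns will not match the saved parameter names. Also, the default degrees of freedom (number of groups minus 2) are derived for a model fitted to the same data, so the p-value on an external sample should be read with some caution; the LACKFIT documentation for your SAS/STAT release may describe a way to adjust this.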

jwexler
SAS Employee

Hi, I just posted a similar response to Kanyange. You would be best served by posting this question in the SAS STAT Community.

I would consider this more of a stats question than a data mining question :)

Thanks,

Jonathan

Kanyange
Fluorite | Level 6

Hi Jonathan,

I am quite confused. I thought that modelling was part of data mining? Also, this test helps to validate the model, to see whether your actual and predicted values are similar. As far as I am aware, this is data mining.

Thanks

Reeza
Super User

I think DataMining here refers to the Enterprise Miner software. Logistic regression is a part of data mining, though, and is part of the e-Miner software suite :)

Kanyange
Fluorite | Level 6

Hi Reeza,

Thanks for your response. I am having trouble getting an answer on this, could you please help? Does EM have this test? Many thanks

Kanyange
Fluorite | Level 6

Thank you very much PG, that's really helpful :)
