buski
Calcite | Level 5

Hi,

I would like to know how I can use a new dataset to validate an established model.

For example: if I have a model of the relationships between health events and climatic variables in New York, how can I apply that same model to a different region with the same variables?

I know it's easy to do in R but I have no idea how to do it in SAS.

Thanks a lot!

9 REPLIES
Ksharp
Super User

How about using a 'by region' statement in your proc to fit the different regions?

Ksharp

buski
Calcite | Level 5

Thanks,

But I think "by region" will give me two sets of different models, not model validation.

Doc_Duke
Rhodochrosite | Level 12

With logistic regression, you can fit an existing model to new data; the syntax is in the manual.  Then you can do something like a Hosmer-Lemeshow test to check the goodness of fit of the existing model to the new data.
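
As a rough, language-agnostic illustration of that idea (not SAS syntax, and not necessarily how any SAS procedure implements it), the sketch below scores new data with frozen logistic coefficients and computes a Hosmer-Lemeshow style statistic. The function names and the equal-size grouping are assumptions made for the example. When no parameters are re-estimated on the new data, the statistic is commonly referred to a chi-square distribution with as many degrees of freedom as groups.

```python
import math

def predict_prob(intercept, coefs, row):
    """Predicted probability from frozen (already estimated) logistic coefficients."""
    eta = intercept + sum(b * x for b, x in zip(coefs, row))
    return 1.0 / (1.0 + math.exp(-eta))

def hosmer_lemeshow(probs, outcomes, groups=10):
    """Hosmer-Lemeshow style statistic for predicted probabilities vs 0/1 outcomes."""
    # Sort by predicted probability only (stable sort keeps ties in input order).
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        denom = expected * (1.0 - expected / len(chunk))
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat
```

A large statistic relative to the reference distribution suggests the existing model is poorly calibrated for the new data.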

Doc Muhlbaier

Duke

buski
Calcite | Level 5

Unfortunately, I used PROC PDLREG, and the predprobs=x option does not apply there.

art297
Opal | Level 21

But can't you still use the STB option and proc score and determine how much variance you account for with each by variable?

buski
Calcite | Level 5

I don't think that using the STB option and PROC SCORE is equivalent to model validation.

I am not going to evaluate the importance of different variables.

What I want to do is to use a new dataset to validate the existing model.

Ksharp
Super User

Model validation depends on the criterion you choose.

For example, the F statistic for PROC REG, or AIC for other models.

I am not sure I understand what you mean by model validation.

Doc_Duke
Rhodochrosite | Level 12

Buski,

I've not used PROC PDLREG, so I'm basing these comments on its documentation. It seems that PDLREG is linear regression with the lag variables from the time series included as an orthogonal polynomial. If my understanding is correct, then a straightforward way to get a handle on the adequacy of the model for a new population is to compute the regression estimate (y-hat) on the new data using the existing model, and then use that as a single covariate in a new model that is otherwise specified exactly as the original one was. If the existing model is adequate, then the other coefficients in the new model will not be significant. To the extent that they are significant (beyond what is expected from type I error), there is evidence of inadequacy.

This is not a nice clean single number as a "score", but it can be quite helpful in figuring out where the model needs work.
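
A simplified variant of this idea, sketched in Python rather than SAS, is a calibration regression: score the new data with the frozen coefficients, then regress the observed outcome on y-hat alone. This is a swapped-in simplification, not the full procedure described above, and the coefficients, data, and function names are all hypothetical. An intercept near 0 and a slope near 1 suggest the existing model transfers well.

```python
# Hypothetical frozen coefficients from the original fit (e.g. New York).
INTERCEPT = 1.0
COEFS = [0.5, -0.2]

def predict(intercept, coefs, row):
    """y-hat from the existing model; nothing is re-estimated here."""
    return intercept + sum(b * x for b, x in zip(coefs, row))

def calibration_line(y_hat, y):
    """Ordinary least squares of observed y on y-hat; returns (intercept, slope)."""
    n = len(y)
    mean_x = sum(y_hat) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in y_hat)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(y_hat, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical new-region data: predictor rows and observed outcomes.
new_rows = [[1.0, 2.0], [2.0, 0.5], [3.0, 1.0], [4.0, 3.0]]
new_y = [1.3, 1.8, 2.4, 2.2]

y_hat = [predict(INTERCEPT, COEFS, r) for r in new_rows]
intercept, slope = calibration_line(y_hat, new_y)
print(f"calibration intercept={intercept:.3f}, slope={slope:.3f}")
```

In SAS, the scoring step would be done by whatever mechanism applies the saved estimates to the new dataset; only the second regression is fitted on the new data.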

Doc Muhlbaier

Duke

Rashu
Calcite | Level 5

Buski, I was wondering whether you figured out a way to validate a model. I am trying to do the same and having a hard time. I am trying to validate using ROC and calibration. I know how to get the ROC curve on its own, but I can't figure out how to plug in the existing coefficients. Please let me know if you figured it out. Thanks
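
One way to think about ROC with fixed coefficients, sketched here in Python rather than SAS: compute the linear predictor on the validation data from the existing coefficients, then compute the area under the ROC curve from those scores. No refitting is involved. The function names are hypothetical, and the pairwise AUC below is a plain illustration, not an efficient implementation.

```python
def linear_predictor(intercept, coefs, row):
    """Score a validation row with the existing (frozen) coefficients."""
    return intercept + sum(b * x for b, x in zip(coefs, row))

def auc(scores, labels):
    """Area under the ROC curve: the share of (event, non-event) pairs
    in which the event has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Because the AUC depends only on the ordering of the scores, the linear predictor can be used directly without converting it to a probability.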
