Zinan
Calcite | Level 5

Hi, I am new to SAS and would like suggestions on my code.

I want to compare two logistic regression models, using 10-fold cross-validation for both. I searched online and found the code below, which generates the folds and runs logistic regression on the training data. But I do not know how to use the trained model to predict my test data and compare the results. If possible, I would like to draw the ROC curve and calculate the AUC.

I have attached my current code here. Please help me with the next step. Thanks very much!

 

%let K=10;
%let rate=%sysevalf((&K-1)/&K);

* Build model with all data;
proc logistic data=cold;
   model y(event="1")=temp time time*temp;
run;

* Generate the cross-validation samples;
proc surveyselect data=cold out=cv seed=231258
   samprate=&rate outall reps=&K;
run;

/* Selected is an automatic variable generated by SURVEYSELECT. */
/* If Selected is true, new_y gets the value of y; otherwise new_y is missing. */
data cv;
   set cv;
   if selected then new_y=y;
run;

/* Get predicted values for the missing new_y in each replicate */
ods output ParameterEstimates=ParamEst;
proc logistic data=cv;
   model new_y(event="1")=temp time time*temp/outroc=roc;
   by replicate;
   output out=out1(where=(new_y=.)) predicted=y_hat;
run;

Accepted Solution
Ksharp
Super User
Use the SCORE statement in PROC LOGISTIC to predict the test data.

https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-to-do-k-fold-CV-with-replacements-replication/m-p/303938#U303938
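A minimal sketch of how the SCORE statement might be used for K-fold cross-validation, assuming the data set cold and the variables y, temp, and time from the question. The data set names folds, scored_1 to scored_10, and cv_pred, the variable fold, and the macro kfold_score are made up for illustration; this is not taken from the linked thread.

```sas
%let K=10;

/* Attach a uniform random number to each observation */
data folds;
   set cold;
   call streaminit(231258);
   _u = rand("uniform");
run;

/* Shuffle the rows by sorting on the random number */
proc sort data=folds;
   by _u;
run;

/* Deal out fold ids 1..K so the folds are nearly equal in size */
data folds;
   set folds;
   fold = mod(_n_, &K) + 1;
   drop _u;
run;

%macro kfold_score;
   %local i;
   %do i = 1 %to &K;
      /* Fit on the other K-1 folds, then score the held-out fold */
      proc logistic data=folds(where=(fold ne &i)) noprint;
         model y(event="1") = temp time time*temp;
         score data=folds(where=(fold = &i)) out=scored_&i;
      run;
   %end;

   /* Stack the held-out predictions from all K folds */
   data cv_pred;
      set %do i = 1 %to &K; scored_&i %end; ;
   run;
%mend kfold_score;

%kfold_score
```

With a numeric 0/1 response, the SCORE output should store the predicted event probability in a variable named P_1, so cv_pred ends up with one held-out prediction per observation.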


Calculate AUC.

http://blogs.sas.com/content/iml/2011/07/29/computing-an-roc-curve-from-basic-principles.html

http://blogs.sas.com/content/iml/2011/06/03/a-statistical-application-of-numerical-integration-the-area-under-an-roc-curve.html

http://blogs.sas.com/content/iml/2011/07/08/the-area-under-a-density-estimate-curve-nonparametric-estimates.html

http://blogs.sas.com/content/iml/2011/05/27/obtaining-area-from-a-set-of-points-on-a-curve.html
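As an alternative to integrating the curve by hand as in the posts above, PROC LOGISTIC can report the area itself when it is given predictions that were computed elsewhere. The sketch below assumes the data set out1 from the question, which holds the held-out rows of each replicate with the observed y and the prediction y_hat. The data set name cv_auc and the label "Held-out data" are made up for illustration.

```sas
ods graphics on;

/* Capture the area estimates, one row per replicate */
ods output ROCAssociation=cv_auc;

/* NOFIT skips model fitting; the ROC statement builds the curve
   directly from the predicted probabilities named in PRED= */
proc logistic data=out1 plots(only)=roc;
   by replicate;
   model y(event="1") = y_hat / nofit;
   roc "Held-out data" pred=y_hat;
run;

/* Summarize the held-out AUC across replicates */
proc means data=cv_auc mean std min max;
   var Area;
run;
```

If a held-out sample happens to contain only events or only non-events, no ROC curve can be computed for that replicate, so small data sets may need fewer folds.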

Reeza
Super User

What exactly is your final model, though?

You ran a regression for each slice, so you have multiple models, so to speak. How are you planning to finalize that model?

Zinan
Calcite | Level 5

Thanks very much for your reply. The links in another reply helped me find the answer (drawing the ROC curve and calculating the AUC). But since I am still new to SAS and statistics, I would like to know whether you see any problems with the procedure below.

 

Sorry that my post was a little confusing. I want to compare the predictive power of two different models. Because the response is binary, I used logistic regression, and I used 10-fold cross-validation to evaluate the two models. For each model, I trained on 9 folds of the data and then used the held-out fold to draw the ROC curve. I was wondering whether AUC is a good indicator for evaluating the two models.
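A rough sketch of this comparison, assuming the data set cv with new_y and replicate from my first post. The second model (main effects only) is hypothetical, since only one model appears in this thread, and the names p1, p12, p_model1, and p_model2 are made up for illustration.

```sas
/* Model 1: the model from the first post */
proc logistic data=cv noprint;
   by replicate;
   model new_y(event="1") = temp time time*temp;
   output out=p1(drop=_level_) predicted=p_model1;
run;

/* Model 2: hypothetical competitor without the interaction */
proc logistic data=p1 noprint;
   by replicate;
   model new_y(event="1") = temp time;
   output out=p12(drop=_level_) predicted=p_model2;
run;

/* Compare the two ROC curves on the held-out rows only.
   NOFIT skips fitting; PRED= supplies the stored predictions. */
proc logistic data=p12(where=(new_y=.));
   by replicate;
   model y(event="1") = p_model1 p_model2 / nofit;
   roc "Model 1" pred=p_model1;
   roc "Model 2" pred=p_model2;
   roccontrast reference("Model 1") / estimate;
run;
```

Each replicate holds out only about a tenth of the data, so the per-replicate contrasts may be noisy; averaging the held-out areas across replicates is probably a steadier summary.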

 

Thanks again!

 

Zinan
Calcite | Level 5

Thanks very much! Your links helped me find ways to calculate the AUC. This is awesome!
