Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC

New Contributor
Posts: 3

Hi, I am new to SAS and need suggestions on my code.

I want to compare two logistic regression models, using 10-fold cross-validation for both. Searching online, I found the code below, which generates the folds and runs logistic regression on my training data sets. But I do not know how to use the trained model to predict on my test data set and compare the predictions with the observed values. If possible, I would also like to draw the ROC curve and calculate the AUC.

I have attached my current code here. Please help me with the next step. Thanks very much!

 

%let K=10;
%let rate=%sysevalf((&K-1)/&K);

/* Build the model with all data */
proc logistic data=cold;
  model y(event="1")=temp time time*temp;
run;

/* Generate the cross-validation samples: each replicate selects a random (K-1)/K of the rows */
proc surveyselect data=cold out=cv seed=231258
  samprate=&rate outall reps=&K;
run;

/* The variable Selected is an automatic variable generated by SURVEYSELECT. */
/* If Selected is true, new_y gets the value of y; otherwise it is missing. */
data cv;
  set cv;
  if selected then new_y=y;
run;

/* Get predicted values for the missing new_y in each replicate. */
/* Rows with missing new_y are left out of the fit but still receive y_hat. */
ods output ParameterEstimates=ParamEst;
proc logistic data=cv;
  model new_y(event="1")=temp time time*temp/outroc=roc;
  by replicate;
  output out=out1(where=(new_y=.)) predicted=y_hat;
run;


Accepted Solutions
Solution
‎11-16-2016 01:49 PM
Super User
Posts: 10,041

Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC

Use the SCORE statement in PROC LOGISTIC to predict the test data.

https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/How-to-do-k-fold-CV-with-replacements-replication/m-p/303938#U303938
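As a rough illustration of the SCORE approach, here is a minimal sketch that reuses the data set (cold) and model from the question. The fold variable, the macro name kfold_score, and the data set names folds, pred0, pred1, ... and cvpred are hypothetical names chosen for this example, and the fold assignment through PROC RANK is only one of several ways to split the data. Each observation lands in exactly one fold, the model is fitted on the other K-1 folds, and the held-out fold is scored.

```sas
%let K = 10;

/* Attach a uniform random number to every observation (seed taken from the question). */
data folds;
  set cold;
  if _n_ = 1 then call streaminit(231258);
  _u = rand("uniform");
run;

/* Cut the random numbers into K equal-sized groups: fold = 0, 1, ..., K-1. */
proc rank data=folds out=folds groups=&K;
  var _u;
  ranks fold;
run;

/* Hypothetical helper macro: train on K-1 folds, score the held-out fold. */
%macro kfold_score(k=10);
  %local i last;
  %let last = %eval(&k - 1);
  %do i = 0 %to &last;
    proc logistic data=folds(where=(fold ne &i)) noprint;
      model y(event="1") = temp time time*temp;
      /* SCORE applies the fitted model to the rows that were not used for fitting. */
      score data=folds(where=(fold = &i)) out=pred&i;
    run;
  %end;

  /* Stack the K scored folds: one out-of-fold prediction per observation. */
  data cvpred;
    set pred0 - pred&last;
  run;
%mend kfold_score;

%kfold_score(k=&K)
```

With y coded 0/1, the scored data sets should contain the predicted event probability in a variable named P_1, alongside the original variables.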


To calculate the AUC, see:

http://blogs.sas.com/content/iml/2011/07/29/computing-an-roc-curve-from-basic-principles.html

http://blogs.sas.com/content/iml/2011/06/03/a-statistical-application-of-numerical-integration-the-area-under-an-roc-curve.html

http://blogs.sas.com/content/iml/2011/07/08/the-area-under-a-density-estimate-curve-nonparametric-estimates.html

http://blogs.sas.com/content/iml/2011/05/27/obtaining-area-from-a-set-of-points-on-a-curve.html
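Besides the numerical-integration approaches in these posts, PROC LOGISTIC can also draw an ROC curve and report the AUC directly from a column of predicted probabilities. The sketch below assumes a hypothetical data set cvpred that holds the observed response y and the out-of-fold predicted probability p_1 for every observation; both names are assumptions, not something from the original posts. The NOFIT option skips model fitting, and PRED= in the ROC statement tells the procedure to treat p_1 as the predicted probabilities.

```sas
ods graphics on;

/* cvpred and p_1 are assumed names: observed y plus held-out predictions. */
proc logistic data=cvpred;
  /* NOFIT: no model is fitted; p_1 is listed so the ROC statement can use it. */
  model y(event="1") = p_1 / nofit;
  /* Build the ROC curve and AUC from the supplied probabilities. */
  roc "10-fold CV" pred=p_1;
run;
```

To compare two candidate models, one option is to keep the held-out predictions from each model in separate variables, add a second ROC statement, and follow with a ROCCONTRAST statement to test the difference in AUC. Alternatively, the FITSTAT option of the SCORE statement reports the AUC for each scored data set.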


All Replies
Super User
Posts: 19,855

Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC

What exactly is your final model though? 

 

You ran a regression for each slice so you have multiple models so to speak. How are you planning to finalize that model? 

New Contributor
Posts: 3

Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC

Thanks very much for your reply. The links in the other reply helped me find the answer (drawing the ROC curve and calculating the AUC). But since I am still new to SAS and statistics, I would like to know whether you see any problems with the procedure below.

Sorry that my post was a little confusing. I want to compare the predictive power of two different models. Because the response is binary, I used logistic regression, and I used 10-fold cross-validation to evaluate the two models. For each model, I used 9 folds of the data to train the model and then scored the held-out fold to draw the ROC curve. I am wondering whether the AUC is a good indicator for comparing the two models.

Thanks again!

 

New Contributor
Posts: 3

Re: In k fold CV how use trained model (logistic regression) to compare test data set and draw ROC

Thanks very much! Your links helped me find ways to calculate the AUC. This is awesome!
