- In k fold CV how use trained model (logistic regre...

11-16-2016 02:41 AM - edited 11-16-2016 02:42 AM

Hi, I am new to SAS and need suggestions on my code.

I want to compare two logistic regression models using 10-fold cross-validation for both. I searched online and found the code below, which generates the folds and fits a logistic regression on each training set. But I do not know how to use the trained model to predict on my held-out test data and compare the predictions with the observed values. If possible, I would also like to draw the ROC curve and calculate the AUC.

I have attached my current code here. Please help me with the next step. Thanks very much!

%let K=10;
%let rate=%sysevalf((&K-1)/&K);

* Build model with all data;
proc logistic data=cold;
model y(event="1")=temp time time*temp;
run;

* Generate the cross validation sample;
proc surveyselect data=cold out=cv seed=231258
samprate=&rate outall reps=10;
run;

/* The variable selected is an automatic variable generated by surveyselect. */
/* If selected is true, then new_y gets the value of y; otherwise it is missing. */
data cv;
set cv;
if selected then new_y=y;
run;

/* Get predicted values for the missing new_y in each replicate */
ods output ParameterEstimates=ParamEst;
proc logistic data=cv;
model new_y(event="1")=temp time time*temp/outroc=roc;
by replicate;
output out=out1(where=(new_y=.)) predicted=y_hat;
run;
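A note on the sampling step above, offered as a sketch rather than a correction of the original code: as far as I understand PROC SURVEYSELECT, REPS= draws each replicate independently, so the ten 10% hold-out sets can overlap and some rows may never be held out. That is repeated random hold-out rather than a strict 10-fold partition. If disjoint folds are wanted, one possible replacement for the SURVEYSELECT step and the DATA step that follows it is shown here. The names folds, fold, and u are hypothetical; cold, y, new_y, selected, replicate, and &K come from the post. It assumes y has no missing values in cold.

```sas
/* Sketch: assign each row of cold to exactly one of &K folds.       */
/* folds, fold and u are hypothetical names, not from the post.      */
data folds;
   set cold;
   call streaminit(231258);      /* same seed value as in the post   */
   u = rand("uniform");          /* random sort key                  */
run;

proc sort data=folds;
   by u;
run;

data folds;
   set folds;
   fold = mod(_n_, &K) + 1;      /* 1..&K, fold sizes differ by at most 1 */
   drop u;
run;

/* Stack &K copies; in copy r, fold r is held out (new_y is missing). */
data cv;
   set folds;
   do replicate = 1 to &K;
      selected = (fold ne replicate);
      if selected then new_y = y;
      else new_y = .;
      output;
   end;
run;

proc sort data=cv;
   by replicate;
run;
```

With cv built this way, the PROC LOGISTIC step with by replicate should run unchanged, and out1 should then hold one held-out prediction per original row.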

Accepted Solutions

Solution

11-16-2016
01:49 PM


Posted in reply to Zinan

11-16-2016 09:43 AM

All Replies


Posted in reply to Zinan

11-16-2016 03:46 AM

What exactly is your final model though?

You ran a regression for each slice, so you have multiple models, so to speak. How are you planning to settle on a final model?


Posted in reply to Reeza

11-16-2016 01:49 PM

Thanks very much for your reply. The links in another reply helped me find the answer (drawing the ROC curve and calculating the AUC). But since I am still new to SAS and statistics, I was wondering whether you see any problems with the procedure below.

Sorry that my post was a little confusing. I want to compare the predictive power of two different models. Because the response is binary, I used logistic regression, and I used 10-fold cross-validation to evaluate the two models. For each model, I trained on 9 folds of the data and then scored the held-out fold to draw the ROC curve. I was wondering whether AUC is a good indicator for evaluating the two models.

Thanks again!
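The evaluation described above (train on nine folds, score the held-out fold, summarize with ROC and AUC) can be sketched in SAS as follows. This is only a sketch, assuming the out1 data set from the first post, which keeps the held-out rows with the observed y and the predicted probability y_hat. It relies on the ROC statement's PRED= option together with NOFIT, so that PROC LOGISTIC evaluates the supplied probabilities instead of fitting a new model. Listing y_hat in the MODEL statement is my understanding of what PRED= expects and is worth checking against the PROC LOGISTIC documentation. The label "Held-out predictions" is arbitrary.

```sas
/* Sketch: ROC curve and AUC from held-out predictions only.         */
/* Assumes out1 (from the first post) has y and y_hat on each row.   */
proc logistic data=out1;
   model y(event="1") = y_hat / nofit;       /* no refit; evaluate y_hat as given   */
   roc "Held-out predictions" pred=y_hat;    /* y_hat used as predicted probability */
run;
```

This pools the held-out rows from all replicates into one curve; adding by replicate; would give one AUC per replicate instead. Repeating the same steps for the second model gives a second AUC to compare. Note that AUC measures how well a model ranks events above non-events, not how well its probabilities are calibrated.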


Posted in reply to Ksharp

11-16-2016 01:42 PM

Thanks very much! Your links helped me find ways to calculate the AUC. This is awesome!