SAS Programming

whs278 · Posted 06-29-2021 01:06 PM

I'm trying to calculate the AUC on a holdout test (or validation) data set. My model has a pretty good AUC on the training data (.87), but I would like to see if it performs well out of sample.

Let's say the original dataset contains three variables: Y, X1, and X2. I split this dataset into two smaller datasets, XTRAIN and XTEST.

These are the steps I have taken.

First, I trained my model on the training dataset XTRAIN.

proc logistic data=XTRAIN outmodel=MODEL1;
    model Y(event='1') = X1 X2;
run;

Next I use my model to make predictions on the test dataset.

proc logistic inmodel = MODEL1 ;
    score data = XTEST out = YPRED_test (rename = (P_1 = YPRED));
run;

Next I use these predictions to plot the ROC curve and calculate my test AUC.

/* read the scored test data written by the SCORE statement above */
proc logistic data=YPRED_test;
    model Y(event="1") = ;
    roc pred=YPRED;
    ods select ROCOVERLAY;
run;

I just wanted to check if these steps were correct. In general, these are the steps for out-of-sample model validation I have used when programming in R and Python.

PaigeMiller · Posted 06-29-2021 01:41 PM

https://support.sas.com/kb/39/724.html

--
Paige Miller

whs278 · Posted 06-29-2021 02:09 PM

Thanks, I actually read that but was confused on two points.

First, it seems that having INMODEL as a separate step is unnecessary, because I can just add a SCORE statement to the first PROC LOGISTIC step. I just wanted to confirm that this is correct.

proc logistic data = XTRAIN outmodel= MODEL1 ;
model Y (EVENT = '1')= X1 X2 ;
score data = XTEST out =YTEST (rename = (P_1 = YPRED));
run;

Second, I don't understand why you need the MODEL statement in the second PROC LOGISTIC step, since you already fit the model in the first step.

proc logistic data= YTEST;
model Y(event="1")=;
roc pred =YPRED;
ods select ROCOVERLAY;
run;

PaigeMiller · Posted 06-29-2021 02:19 PM

That link is pretty old, and maybe there are newer versions of code that do the same (or maybe there were always two ways to do this).

You can try it both ways and see if the results are the same. You can also try the second piece of code without the MODEL statement and see what happens.
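A minimal sketch of that comparison, assuming the XTRAIN and XTEST datasets from the original post. PRED_A and PRED_B are hypothetical output dataset names. It scores the test data both ways and uses PROC COMPARE to check whether the predicted probabilities (P_1) agree:

```sas
/* Approach 1: fit, save the model, then score in a separate step */
proc logistic data=XTRAIN outmodel=MODEL1;
    model Y(event='1') = X1 X2;
run;

proc logistic inmodel=MODEL1;
    score data=XTEST out=PRED_A;   /* PRED_A is a hypothetical name */
run;

/* Approach 2: fit and score in the same step */
proc logistic data=XTRAIN;
    model Y(event='1') = X1 X2;
    score data=XTEST out=PRED_B;   /* PRED_B is a hypothetical name */
run;

/* Compare the predicted event probabilities, allowing tiny numeric differences */
proc compare base=PRED_A compare=PRED_B criterion=1e-8;
    var P_1;
run;
```

If PROC COMPARE reports no unequal values for P_1, the two approaches scored the test data identically.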

--
Paige Miller

whs278 · Posted 06-29-2021 02:19 PM

Okay, I re-read it and now realize where I was getting confused. I had a hard time understanding that you could fit the model and calculate prediction scores for different datasets in the same PROC LOGISTIC step.

This is ultimately the fastest way to compare training/test AUC.

    proc logistic data=train;
        model y(event="1") = x1 x2;
        score data=train fitstat;
        score data=valid fitstat;
        run;
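If it helps to have the two AUC values in a dataset rather than only in the listing, one option is to capture the FITSTAT table with ODS OUTPUT. This is only a sketch: it assumes the ODS table is named ScoreFitStat (worth confirming with ODS TRACE ON in your SAS release), and FITSTATS is a hypothetical dataset name.

```sas
/* Assumption: the FITSTAT output is exposed as the ODS table ScoreFitStat */
ods output ScoreFitStat=FITSTATS;

proc logistic data=train;
    model y(event="1") = x1 x2;
    score data=train fitstat;   /* in-sample fit statistics, including AUC */
    score data=valid fitstat;   /* out-of-sample fit statistics, including AUC */
run;

/* Print the captured fit statistics for the scored datasets */
proc print data=FITSTATS;
run;
```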


Ksharp · Posted 06-30-2021 09:11 AM

proc logistic data=sashelp.heart;
model status(event='Dead')=weight height/nofit;
roc 'weight' pred=weight;
roc 'height' pred=height;
run;
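The same NOFIT pattern can be applied to a scored holdout dataset. A sketch, assuming the YTEST dataset and YPRED variable from the earlier reply in this thread; the 'Holdout' label is arbitrary. Here the MODEL statement only identifies the response and its event level, NOFIT skips the fitting, and the ROC statement treats YPRED as already-computed predicted probabilities:

```sas
/* Assumes YTEST holds the observed Y and the scored probability YPRED */
proc logistic data=YTEST;
    model Y(event='1') = YPRED / nofit;   /* no model is fit */
    roc 'Holdout' pred=YPRED;             /* ROC curve and AUC for the supplied predictions */
run;
```

The AUC for the holdout data should then appear in the ROC association statistics for the 'Holdout' curve.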

Calculating AUC on test/validation dataset
