yocrachi
Fluorite | Level 6

Hi all,

 

I want to make a ROC curve for a holdout sample. I fit a logistic regression to data from 2007, and I want to see how well that model fits the data from 2008. I can't use this code:

 


proc logistic data = sasdata.Data2008;
model flag(event='1')=TL_TA EAT_TA AGE /outroc=r;
run;

 

 

because then both the model and the ROC curve are based on a logistic regression fit to the 2008 data set. I want to fit the logistic regression on the 2007 data set and then see how that fit performs on the 2008 data set. So I tried this:

 

 

 

proc logistic data=sasdata.data2007;
class AGE (ref='Ny') / param = ref;
model flag(event="1") = TL_TA EAT_TA AGE / CTABLE outroc=troc;
score data=sasdata.data2008 out=valpred outroc=vroc;
roc; roccontrast;
run;

 

 

This seems OK. I get one ROC curve for the fit to the 2007 data and another for how the 2007 model fits the 2008 data. However, I want to find the optimal cutoff point for 2008, that is, the point on the ROC curve with the smallest Euclidean distance to the top-left corner (0,1). How can I do that? The CTABLE option gives me the classification table for the 2007 data set only.

 

I hope you can help.

1 REPLY
StatDave
SAS Super FREQ

See the ROCPLOT macro. Specify the SCORE OUTROC= data set in the INROC= macro option, and the SCORE OUT= data set and its predicted probabilities in the macro's INPRED= and P= options.  See the macro documentation for information on the various optimality criteria you can use.
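
If you only need the closest-to-corner cutoff, it can also be computed directly from the SCORE OUTROC= data set, without the macro. The following is a minimal sketch, not the macro's own method. It assumes the vroc data set from the question and the standard OUTROC= variables _PROB_ (cutoff), _SENSIT_ (sensitivity) and _1MSPEC_ (1 - specificity). The names vroc_dist and dist are made up for illustration.

```sas
/* Sketch: distance from each point on the 2008 ROC curve to the      */
/* ideal corner, where 1-specificity = 0 and sensitivity = 1.         */
/* vroc is the SCORE OUTROC= data set created in the question.        */
data vroc_dist;               /* hypothetical output data set name */
   set vroc;
   dist = sqrt((1 - _SENSIT_)**2 + (_1MSPEC_)**2);
run;

/* Sort so that the point closest to the corner comes first. */
proc sort data=vroc_dist;
   by dist;
run;

/* The first row holds the cutoff probability with the smallest distance. */
proc print data=vroc_dist(obs=1);
   var _PROB_ _SENSIT_ _1MSPEC_ dist;
run;
```

If several cutoffs tie for the smallest distance, this prints only one of them, so it may be worth printing the first few rows instead of one.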
