ROC curve for hold out sample

Occasional Contributor
Posts: 10

Hi all,

 

I want to minimize the Euclidean distance between the point (0,1) and my ROC curve. I have trained my logistic model on the data set train2007 and want to test the model on the data set pred2008. I have tried this code:

 

 

proc logistic data=sasdata.train2007;
model flag(event="1") = TL_TA EAT_TA / CTABLE outroc=troc;
score data=sasdata.pred2008 out=valpred outroc=vroc;
roc; roccontrast;
run;

 

 

The thing is, CTABLE only gives me the misclassification for the train2007 data set. I want the misclassification for the pred2008 data set, and I want to find the optimal cutoff point by minimizing the Euclidean distance in that data set.

Grand Advisor
Posts: 9,458

Re: ROC curve for hold out sample

Since you can get the ROC table, you can find that cutoff point from it (make _FALPOS_ and _FALNEG_ as small as possible).

 

Obs _PROB_ _POS_ _NEG_ _FALPOS_ _FALNEG_ _SENSIT_ _1MSPEC_
1 0.97674 1 10 0 8 0.11111 0.0
2 0.88438 2 10 0 7 0.22222 0.0
3 0.86057 3 10 0 6 0.33333 0.0
4 0.77359 4 10 0 5 0.44444 0.0
5 0.75478 4 9 1 5 0.44444 0.1
6 0.70552 5 9 1 4 0.55556 0.1
7 0.59623 6 9 1 3 0.66667 0.1
8 0.58251 7 8 2 2 0.77778 0.2
9 0.57043 7 7 3 2 0.77778 0.3
10 0.56984 7 6 4 2 0.77778 0.4
11 0.42628 8 6 4 1 0.88889 0.4
12 0.23915 9 6 4 0 1.00000 0.4
13 0.14442 9 5 5 0 1.00000 0.5
14 0.13271 9 4 6 0 1.00000 0.6
15 0.10293 9 3 7 0 1.00000 0.7
16 0.07392 9 2 8 0 1.00000 0.8
17 0.02239 9 1 9 0 1.00000 0.9
18 0.00109 9 0 10 0 1.00000 1.0
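
To do the same on the hold-out data rather than the training data, the columns above can be read from the OUTROC= data set written by the SCORE statement. The following is a minimal sketch, assuming `vroc` is that data set from the original post and that it carries the same columns as the table above. The names `vroc_dist`, `dist`, and `misclass` are illustrative and not part of the original code.

```sas
/* Distance from each hold-out ROC point to the ideal corner (0,1). */
data vroc_dist;
   set vroc;   /* assumption: OUTROC= data set from the SCORE statement */
   dist     = sqrt(_1MSPEC_**2 + (1 - _SENSIT_)**2);
   misclass = _FALPOS_ + _FALNEG_;   /* total misclassified at this cutoff */
run;

/* Smallest distance first. */
proc sort data=vroc_dist;
   by dist;
run;

/* The first row is the cutoff closest to (0,1). */
proc print data=vroc_dist(obs=1);
   var _PROB_ _SENSIT_ _1MSPEC_ _FALPOS_ _FALNEG_ misclass dist;
run;
```

If `vroc` happens to hold more than one curve, it would need to be subset to the curve of interest before sorting.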
Occasional Contributor
Posts: 10

Re: ROC curve for hold out sample

Thank you. But CTABLE gives me the sensitivity and specificity for train2007. I guess you mean that I should do that for the vroc data set, right?

Occasional Contributor
Posts: 10

Re: ROC curve for hold out sample

And one more thing. My first observation has the lowest misclassification: it has only false negatives (1000 of them). That would mean the optimal cutoff is at (0,0). However, my ROC curve looks like the attached document, and we can see from the curve that the point with the least Euclidean distance from (0,1) is not (0,0). I thought the minimum Euclidean distance would be where the combined misclassification of false positives and false negatives is at its minimum. But from the ROC curve, that's not the point (0,0).

Attachment
Grand Advisor
Posts: 9,458

Re: ROC curve for hold out sample

The OUTROC table also gives you sensitivity and 1-specificity.

I don't understand what you mean by Euclidean distance here. Do you mean the point where slope=1?

 

Obs _PROB_ _POS_ _NEG_ _FALPOS_ _FALNEG_ _SENSIT_ _1MSPEC_
Occasional Contributor
Posts: 10

Re: ROC curve for hold out sample

Actually, I think I'm just misunderstanding what the minimum Euclidean distance from (0,1) to the ROC curve is telling me. The articles say it's a way of finding an optimal cutoff point. I thought the minimum Euclidean distance cutoff gave me the point where both false negatives and false positives are at a minimum, but that's not what it gives me. So what does it give me?
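
For reference, the quantity in question can be written out. This is the usual closest-to-(0,1) criterion. The symbols $c$ (cutoff), $\mathrm{Se}$ (sensitivity), and $\mathrm{Sp}$ (specificity) are notation assumed here, since the thread does not define any.

```latex
d(c) = \sqrt{\bigl(1 - \mathrm{Se}(c)\bigr)^{2} + \bigl(1 - \mathrm{Sp}(c)\bigr)^{2}}
```

Both terms are rates, not counts: the false negative rate and the false positive rate each enter with equal weight, whatever the class sizes. Minimizing the count _FALPOS_ + _FALNEG_ is a different criterion, and the two can disagree, especially when one class is much larger than the other. In the table posted earlier in the thread, the cutoffs 0.59623, 0.58251, and 0.23915 all misclassify 4 observations, yet their distances are about 0.348, 0.299, and 0.400. For the 0.58251 row, for example, $d = \sqrt{(1 - 0.77778)^2 + 0.2^2} \approx 0.299$.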

 

 

 

Grand Advisor
Posts: 9,458

Re: ROC curve for hold out sample

I think you mean the point where the slope is 1.

You can find it by calculating the slope:

slope=(y2-y1)/(x2-x1)

Here y is sensitivity and x is 1-specificity.

Calculate the slope from each pair of adjacent observations.

Where the slope goes from > 1 to < 1 is the slope=1 point.
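
The adjacent-observation slope described above could be sketched in a DATA step like this. It assumes `vroc` is the OUTROC= data set from the original post and that its rows are already ordered along the curve, as in the table earlier in the thread. The names `vroc_slope`, `dx`, `dy`, and `slope` are illustrative.

```sas
/* Slope of the ROC curve between consecutive cutoffs. */
data vroc_slope;
   set vroc;               /* assumption: rows ordered along the curve  */
   dx = dif(_1MSPEC_);     /* x2 - x1: change in 1-specificity          */
   dy = dif(_SENSIT_);     /* y2 - y1: change in sensitivity            */
   /* slope stays missing on the first row and on vertical steps */
   if dx > 0 then slope = dy / dx;
run;
```

One caveat: an empirical ROC curve is a step function. In the table posted earlier, every step is either purely vertical or purely horizontal, so the slope between adjacent rows is either undefined or 0. In that situation the slope may have to be computed over wider intervals before a crossing of 1 shows up.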
