yocrachi
Fluorite | Level 6

Hi all,

 

I want to minimize the Euclidean distance between the point (0,1) and my ROC curve. I have trained my logistic model on the data set train2007 and want to test it on the data set pred2008. I have tried this code:

 

 

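proc logistic data=sasdata.train2007;
   /* CTABLE and this OUTROC= (troc) describe the training data, train2007, only */
   model flag(event="1") = TL_TA EAT_TA / CTABLE outroc=troc;
   /* SCORE applies the fitted model to pred2008; its OUTROC= (vroc) holds the ROC points for that data set */
   score data=sasdata.pred2008 out=valpred outroc=vroc;
   roc; roccontrast;
run;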

 

 

The problem is that CTABLE only gives me the misclassification table for the train2007 data set. I want the misclassification for the pred2008 data set, and I want to find the optimal cutoff point by minimizing the Euclidean distance in that data set.

6 REPLIES
Ksharp
Super User

Since you can get the ROC table, you can get that cutoff point from it (make _FALPOS_ and _FALNEG_ as small as possible).

 

Obs   _PROB_  _POS_  _NEG_  _FALPOS_  _FALNEG_  _SENSIT_  _1MSPEC_
  1  0.97674      1     10         0         8   0.11111       0.0
  2  0.88438      2     10         0         7   0.22222       0.0
  3  0.86057      3     10         0         6   0.33333       0.0
  4  0.77359      4     10         0         5   0.44444       0.0
  5  0.75478      4      9         1         5   0.44444       0.1
  6  0.70552      5      9         1         4   0.55556       0.1
  7  0.59623      6      9         1         3   0.66667       0.1
  8  0.58251      7      8         2         2   0.77778       0.2
  9  0.57043      7      7         3         2   0.77778       0.3
 10  0.56984      7      6         4         2   0.77778       0.4
 11  0.42628      8      6         4         1   0.88889       0.4
 12  0.23915      9      6         4         0   1.00000       0.4
 13  0.14442      9      5         5         0   1.00000       0.5
 14  0.13271      9      4         6         0   1.00000       0.6
 15  0.10293      9      3         7         0   1.00000       0.7
 16  0.07392      9      2         8         0   1.00000       0.8
 17  0.02239      9      1         9         0   1.00000       0.9
 18  0.00109      9      0        10         0   1.00000       1.0
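
As a rough sketch of how the distance criterion could be applied to a table like this one: the distance from (0,1) to each ROC point is sqrt((1 - _SENSIT_)**2 + _1MSPEC_**2), so it can be computed row by row and the smallest value picked. The steps below assume the validation table is the vroc data set written by the OUTROC= option of the SCORE statement and that it carries the same columns as the table above. The names dist, vroc_dist and best_cutoff are made up for illustration.

```sas
/* Distance from the ideal corner (0,1) to every point of the validation ROC curve. */
data vroc_dist;
    set vroc;   /* assumed: OUTROC= data set from the SCORE statement */
    dist = sqrt((1 - _SENSIT_)**2 + _1MSPEC_**2);
run;

/* Smallest distance first. */
proc sort data=vroc_dist;
    by dist;
run;

/* The first row after sorting holds the candidate cutoff in _PROB_. */
data best_cutoff;
    set vroc_dist(obs=1);
run;

proc print data=best_cutoff;
    var _PROB_ _SENSIT_ _1MSPEC_ _FALPOS_ _FALNEG_ dist;
run;
```

Applied by hand to the 18-row table above, the smallest distance is at row 8 (_PROB_ = 0.58251), where sqrt(0.22222**2 + 0.2**2) is about 0.299; rows 7 and 9 give about 0.348 and 0.373.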
yocrachi
Fluorite | Level 6

Thank you. But CTABLE gives me the sensitivity and specificity for train2007. I guess you mean that I should do that for the vroc data set, right?

yocrachi
Fluorite | Level 6

And one more thing. My first observation has the lowest misclassification: it has only false negatives, 1000 of them. That would mean the optimal cutoff is at (0,0). However, my ROC curve looks like the attached image, and from the curve we can see that the point with the smallest Euclidean distance to (0,1) is not (0,0). I thought the minimum Euclidean distance would be where the combined misclassification of false positives and false negatives is smallest? Judging from the ROC curve, that is not the point (0,0).


ROC curve.jpg
Ksharp
Super User

The OUTROC table can also give you sensitivity and 1-specificity.

I don't understand what you mean by Euclidean distance here. Do you mean the point where the slope is 1?

 

Obs _PROB_ _POS_ _NEG_ _FALPOS_ _FALNEG_ _SENSIT_ _1MSPEC_
yocrachi
Fluorite | Level 6

Actually, I think I'm just misunderstanding what the minimum Euclidean distance from (0,1) to the ROC curve tells me. The articles say it's a way of finding an optimal cutoff point. I thought the minimum Euclidean distance cutoff gave me the point where both false negatives and false positives are at a minimum, but that's not what it gives me. So what does it give me, then?
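
For reference, the quantity in question can be written out as follows. This is the usual definition of the distance criterion rather than anything specific to this thread; the symbols P and N (the numbers of actual events and non-events) and c (the cutoff) are introduced here only for illustration.

```latex
d(c) = \sqrt{\bigl(1 - \mathrm{Se}(c)\bigr)^2 + \bigl(1 - \mathrm{Sp}(c)\bigr)^2}
     = \sqrt{\left(\frac{\mathrm{FN}(c)}{P}\right)^2 + \left(\frac{\mathrm{FP}(c)}{N}\right)^2}
```

So the criterion balances the false negative rate against the false positive rate, each scaled by its own class size, rather than minimizing the raw count FN + FP. When one class is much smaller than the other, the cutoff with the fewest total misclassifications can sit near (0,0) while the point closest to (0,1) lies elsewhere on the curve, which would be consistent with what is described above.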

 

 

 

Ksharp
Super User

I think you mean the point where the slope is 1.

You can find it by calculating the slope:

slope=(y2-y1)/(x2-x1)

Here y is sensitivity and x is 1-specificity.

You can calculate the slope from each pair of adjacent observations.

Where the slope goes from > 1 to < 1, you have the slope=1 point.
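
A minimal sketch of that slope calculation, assuming the ROC points are in the vroc data set written by OUTROC= on the SCORE statement. The names vroc_sorted, roc_slope, dx, dy and slope are made up for illustration. DIF takes the difference between the current row and the previous one, and vertical steps (no change in 1-specificity) are left with a missing slope instead of dividing by zero.

```sas
/* Order the ROC points from (0,0) towards (1,1), i.e. by decreasing cutoff. */
proc sort data=vroc out=vroc_sorted;
    by descending _PROB_;
run;

data roc_slope;
    set vroc_sorted;
    dx = dif(_1MSPEC_);   /* x2 - x1, missing on the first row */
    dy = dif(_SENSIT_);   /* y2 - y1, missing on the first row */
    if dx > 0 then slope = dy / dx;   /* stays missing on vertical steps */
run;

proc print data=roc_slope;
    var _PROB_ _SENSIT_ _1MSPEC_ slope;
run;
```

Two caveats. On an empirical ROC table the steps are often purely horizontal or purely vertical, so the raw slopes may need some smoothing before a slope=1 point shows up. Also, the slope=1 point corresponds to the largest value of sensitivity minus (1-specificity), which need not be the same point as the one closest to (0,1).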
