ROC in SAS: obtaining a cut-off value

Posted 05-15-2014 09:53 PM
(55294 views)

Hi there,

I am interested in finding a cut-off value of a continuous variable in my dataset.

Has anyone performed ROC analyses in SAS to obtain a cut-off value?

After trying some variations of SAS code, I got the attached results, but I am not quite sure how to interpret them.

It would be helpful if you could help me with the code to obtain the criterion/cut-off value and the area under the curve. Is there any simple, straightforward way to do this in SAS?

Below is the SAS code I tried:

ods graphics on;

proc logistic data=diabet.data plots(only)=(roc(id=obs) effect) descending;

model cadcat2=eatcm / scale=none clparm=wald clodds=pl rsquare;

run;

ods graphics off;

I tried SPSS too, but for some reason I kept getting the following warning:

*Text: 0,1 Command: ROC*

*An invalid numeric field has been found. The result has been set to the system-missing value.*

*Execution of this command stops.*

Any help would be greatly appreciated.

Thank you very much!

Ashwini

Accepted Solutions

Choosing a cutpoint depends on what criterion you want to use. The best tool for this is the CTABLE option in the MODEL statement. This option displays a table with statistics for each of a range of cutpoints such as the correct classification rate, false positive and negative rates, etc. If you want, you can change the cutpoints used in the table with the PPROB= option. For example, with this table you could find the cutpoint that maximizes the correct classification rate, or the cutpoint that satisfies your criteria for false positive and false negative rates.
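As a minimal sketch of the options described above, the following reuses the data set and variable names from the original post (`diabet.data`, `cadcat2`, `eatcm`). The 0.02 grid of probability cutpoints is an assumption chosen only for illustration; any value list accepted by PPROB= could be used instead.

```sas
/* Sketch: classification table at a grid of probability cutpoints. */
/* Data set and variable names come from the original post;          */
/* the 0.02 step in PPROB= is an illustrative assumption.            */
proc logistic data=diabet.data descending;
   model cadcat2 = eatcm / ctable pprob=(0 to 1 by 0.02);
run;
```

The same run also reports the area under the ROC curve: it is the `c` statistic in the "Association of Predicted Probabilities and Observed Responses" table.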

There are other follow-on questions in this thread. This response answers the original poster's question.

This is helpful. But I wonder how I would find the cut-point value of that continuous variable and the area under the curve.

Can I find that using any other option in the SAS code?

Thanks for the response! I did calculate Youden's J statistic, but I am not sure how to find the corresponding cut-off value of the variable of interest, EATCM. Below I am pasting part of the SAS output; the first column is the probability level and the last column is Youden's J. The maximum value of Youden's J is 154.6, and it corresponds to a probability of 0.12. From this information, how would I know the cut-off value of the variable of interest?

Classification Table

| Prob Level | Correct: Event | Correct: Non-Event | Incorrect: Event | Incorrect: Non-Event | Correct (%) | Sensitivity (%) | Specificity (%) | False POS (%) | False NEG (%) | J |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 33 | 0 | 328 | 0 | 9.1 | 100 | 0 | 90.9 | . | 99 |
| 0.02 | 33 | 62 | 266 | 0 | 26.3 | 100 | 18.9 | 89 | 0 | 117.9 |
| 0.04 | 31 | 162 | 166 | 2 | 53.5 | 93.9 | 49.4 | 84.3 | 1.2 | 142.3 |
| 0.06 | 26 | 209 | 119 | 7 | 65.1 | 78.8 | 63.7 | 82.1 | 3.2 | 141.5 |
| 0.08 | 26 | 236 | 92 | 7 | 72.6 | 78.8 | 72 | 78 | 2.9 | 149.8 |
| 0.1 | 25 | 254 | 74 | 8 | 77.3 | 75.8 | 77.4 | 74.7 | 3.1 | 152.2 |
| 0.12 | 24 | 272 | 56 | 9 | 82 | 72.7 | 82.9 | 70 | 3.2 | 154.6 |
| 0.14 | 20 | 280 | 48 | 13 | 83.1 | 60.6 | 85.4 | 70.6 | 4.4 | 145 |
| 0.16 | 17 | 285 | 43 | 16 | 83.7 | 51.5 | 86.9 | 71.7 | 5.3 | 137.4 |
| 0.18 | 16 | 291 | 37 | 17 | 85 | 48.5 | 88.7 | 69.8 | 5.5 | 136.2 |
| 0.2 | 16 | 301 | 27 | 17 | 87.8 | 48.5 | 91.8 | 62.8 | 5.3 | 139.3 |
| 0.22 | 14 | 304 | 24 | 19 | 88.1 | 42.4 | 92.7 | 63.2 | 5.9 | 134.1 |
| 0.24 | 12 | 306 | 22 | 21 | 88.1 | 36.4 | 93.3 | 64.7 | 6.4 | 128.7 |
| 0.26 | 10 | 307 | 21 | 23 | 87.8 | 30.3 | 93.6 | 67.7 | 7 | 122.9 |
| 0.28 | 9 | 309 | 19 | 24 | 88.1 | 27.3 | 94.2 | 67.9 | 7.2 | 120.5 |
| 0.3 | 9 | 314 | 14 | 24 | 89.5 | 27.3 | 95.7 | 60.9 | 7.1 | 122 |
| 0.32 | 8 | 318 | 10 | 25 | 90.3 | 24.2 | 97 | 55.6 | 7.3 | 120.2 |
| 0.34 | 4 | 320 | 8 | 29 | 89.8 | 12.1 | 97.6 | 66.7 | 8.3 | 108.7 |
| 0.36 | 4 | 320 | 8 | 29 | 89.8 | 12.1 | 97.6 | 66.7 | 8.3 | 108.7 |
| 0.38 | 4 | 320 | 8 | 29 | 89.8 | 12.1 | 97.6 | 66.7 | 8.3 | 108.7 |
| 0.4 | 4 | 321 | 7 | 29 | 90 | 12.1 | 97.9 | 63.6 | 8.3 | 109 |
| 0.42 | 3 | 323 | 5 | 30 | 90.3 | 9.1 | 98.5 | 62.5 | 8.5 | 106.6 |
| 0.44 | 2 | 323 | 5 | 31 | 90 | 6.1 | 98.5 | 71.4 | 8.8 | 103.6 |
| 0.46 | 2 | 323 | 5 | 31 | 90 | 6.1 | 98.5 | 71.4 | 8.8 | 103.6 |
| 0.48 | 2 | 323 | 5 | 31 | 90 | 6.1 | 98.5 | 71.4 | 8.8 | 103.6 |
| 0.5 | 2 | 324 | 4 | 31 | 90.3 | 6.1 | 98.8 | 66.7 | 8.7 | 103.9 |
| 0.52 | 2 | 325 | 3 | 31 | 90.6 | 6.1 | 99.1 | 60 | 8.7 | 104.2 |
| 0.54 | 2 | 325 | 3 | 31 | 90.6 | 6.1 | 99.1 | 60 | 8.7 | 104.2 |
| 0.56 | 2 | 325 | 3 | 31 | 90.6 | 6.1 | 99.1 | 60 | 8.7 | 104.2 |
| 0.58 | 1 | 325 | 3 | 32 | 90.3 | 3 | 99.1 | 75 | 9 | 101.1 |
| 0.6 | 1 | 326 | 2 | 32 | 90.6 | 3 | 99.4 | 66.7 | 8.9 | 101.4 |
| 0.62 | 1 | 326 | 2 | 32 | 90.6 | 3 | 99.4 | 66.7 | 8.9 | 101.4 |
| 0.64 | 1 | 326 | 2 | 32 | 90.6 | 3 | 99.4 | 66.7 | 8.9 | 101.4 |
| 0.66 | 0 | 326 | 2 | 33 | 90.3 | 0 | 99.4 | 100 | 9.2 | 98.4 |
| 0.68 | 0 | 326 | 2 | 33 | 90.3 | 0 | 99.4 | 100 | 9.2 | 98.4 |
| 0.7 | 0 | 326 | 2 | 33 | 90.3 | 0 | 99.4 | 100 | 9.2 | 98.4 |
| 0.72 | 0 | 326 | 2 | 33 | 90.3 | 0 | 99.4 | 100 | 9.2 | 98.4 |
| 0.74 | 0 | 327 | 1 | 33 | 90.6 | 0 | 99.7 | 100 | 9.2 | 98.7 |
| 0.76 | 0 | 327 | 1 | 33 | 90.6 | 0 | 99.7 | 100 | 9.2 | 98.7 |
| 0.78 | 0 | 327 | 1 | 33 | 90.6 | 0 | 99.7 | 100 | 9.2 | 98.7 |
| 0.8 | 0 | 327 | 1 | 33 | 90.6 | 0 | 99.7 | 100 | 9.2 | 98.7 |
| 0.82 | 0 | 327 | 1 | 33 | 90.6 | 0 | 99.7 | 100 | 9.2 | 98.7 |
| 0.84 | 0 | 327 | 1 | 33 | 90.6 | 0 | 99.7 | 100 | 9.2 | 98.7 |
| 0.86 | 0 | 327 | 1 | 33 | 90.6 | 0 | 99.7 | 100 | 9.2 | 98.7 |
| 0.88 | 0 | 327 | 1 | 33 | 90.6 | 0 | 99.7 | 100 | 9.2 | 98.7 |
| 0.9 | 0 | 328 | 0 | 33 | 90.9 | 0 | 100 | . | 9.1 | 99 |

First of all, the formula is true positive PROPORTION + true negative PROPORTION -1. You used percentages, but kept the "1". This is not correct. If you work in percentages, then

J = true positive PERCENTAGE + true negative PERCENTAGE - 100.
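For readers checking the arithmetic, the correction can be written out using the row at probability level 0.12 from the posted table (sensitivity 72.7%, specificity 82.9%). The posted J column is sensitivity% + specificity% - 1, so it differs from the correct percentage-scale value by a constant 99, and the row that maximizes it is unchanged:

$$
\begin{aligned}
J &= \text{sensitivity} + \text{specificity} - 1 = 0.727 + 0.829 - 1 = 0.556,\\
J_{\%} &= 72.7 + 82.9 - 100 = 55.6,\\
J_{\text{posted}} &= 72.7 + 82.9 - 1 = 154.6 = J_{\%} + 99.
\end{aligned}
$$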

Assuming that you get the maximum at p=0.12 (check this), this is what you do. You use the fitted logit equation from your output:

logit(.12) = -4.71 + 0.0231X,

where X is your predictor. This gives:

-1.99 = -4.71 + 0.0231X, or

X = (-1.99 + 4.71)/0.0231 = 117.6. This is your cut-off.

Looking at your fitted curve at the end of your output, this looks about right.
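In general form, the back-calculation inverts the fitted logit for the predictor. The logit here is the natural logarithm of the odds, because it is the inverse of the logistic function $p = 1/(1+e^{-\eta})$. The coefficients $-4.71$ and $0.0231$ are taken from the reply above as rounded estimates from the poster's output, so the result is approximate:

$$
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \beta_1 X
\quad\Longrightarrow\quad
X = \frac{\ln\frac{p}{1-p} - \beta_0}{\beta_1}
$$

$$
\ln\frac{0.12}{0.88} \approx -1.9924, \qquad
X \approx \frac{-1.9924 + 4.71}{0.0231} \approx 117.6
$$

Because the classification table lists probability levels in steps of 0.02, the cut-off obtained this way is resolved only to that grid.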

@lvm: this is mighty helpful! Thanks a lot! Based on Youden's J index and the above calculations (that you've provided), the cut-off value matches the clinically relevant cut-off value that I had used earlier in my analyses. The reviewers had asked for a statistical justification to substantiate the cut-off based on ROC analyses, and I am happy that I have it now.

Greatly appreciate your time and help!

Ashwini

Hi

I am also trying to back-calculate a particular concentration using Youden's Index. However, I have been using a different software package, as I have a continuous dataset. Could you please explain how you derived the values -4.71 and 0.0231, i.e. what variables were used in the regression? Is this something I could do manually by graphing the respective variables in Excel and getting a trend line and regression equation? I also tried to manually generate the value -1.99, and the closest I got was -1.7. Is this a natural log or a log to some other base? Thanks for any assistance you are able to provide.

Regards

I am facing a similar issue and found the above helpful. I have a couple of follow up questions if you don't mind (for both Ashwini_uci and lvm):

- Is the cut point I now get model-based rather than observed? In other words, are we in essence back-calculating the cut point(s), as opposed to being able to match each observation in the OUTROC dataset back to the observed data that were tested to produce the sensitivity and specificity in the first place?
- Noticed log(e) was used to calculate the logit. Any reason for selecting that base vs log(10) instead?

Also just to add a possible solution if not just looking for a single "best" cut point, I received this code from a SAS consultant who I was connected with when I contacted SAS Support directly:

Hi David:

If you have just a single X variable then you could solve for it. Something like this would work:

data test;

seed=2534565;

do i=1 to 20;

x1=ranuni(seed);

logit=-2 + 2*x1;

p=exp(-logit)/(1+exp(-logit));

if ranuni(seed)>p then y=1; else y=0;

output;

end;

drop seed i p logit;

run;

proc logistic data=test outest=parms(keep=intercept x1);

model y=x1/ outroc=out1;

run;

data cutpoints;

set out1;

if _n_=1 then set parms;

x1_cut=(log(_prob_/(1-_prob_))-intercept)/x1;

run;

proc print data=cutpoints;

var _sensit_ _1MSPEC_ _prob_ x1_cut;

run;

Sincerely,

Rob Agnelli

Principal Technical Support Statistician

SAS

Thanks,

David
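If a single "best" cut point is wanted after all, the same OUTROC approach can be extended to pick the row with the largest Youden's J. The sketch below assumes the simulated `test` data set from the code above; the data set name `youden` and the variable `j` are hypothetical names introduced here for illustration.

```sas
/* Sketch: find the OUTROC row that maximizes Youden's J           */
/* and express that cutpoint on the scale of the predictor x1.     */
proc logistic data=test outest=parms(keep=intercept x1);
   model y = x1 / outroc=out1;
run;

data youden;                        /* hypothetical data set name */
   set out1;
   if _n_ = 1 then set parms;       /* intercept and slope (x1) are retained */
   j = _sensit_ - _1mspec_;         /* sensitivity + specificity - 1 */
   x1_cut = (log(_prob_ / (1 - _prob_)) - intercept) / x1;
run;

proc sort data=youden;
   by descending j;
run;

/* The first row after sorting has the largest J. */
proc print data=youden(obs=1);
   var _prob_ _sensit_ _1mspec_ j x1_cut;
run;
```

Each `_prob_` value in the OUTROC data set is a predicted probability from the fitted model, so with a single predictor the back-calculated `x1_cut` should land on (or very near) an observed value of `x1` rather than on an arbitrary grid point.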

Hi Dverbel,

Is the above formula for getting the cut-off value correct? I am not sure about the formula.
