timothy19
Fluorite | Level 6

Hello Everyone,

I am trying to calculate an optimal cutoff point. I know about Youden's index, but it is not valid for data with repeated measurements. I would be glad if anyone could help me find an optimal cutoff point for repeated-measures data.
The dataset is below.

data have;
infile datalines truncover;
input ID Position $ Time Resistance Force y;
datalines;
1   Ant 3.82    -23 37  1
1   Ant 4.94    -24 4.4 0
1   Ant 3.49    -41.5   15.8    1
1   Ant 3.07    -39.5   21.15   1
1   Post    3.43    -39 29  1
1   Post    3.53    -45.5   14.15   1
1   Post    4.44    -46 9.55    1
1   Post    3.37    -19 12.9    1
1   Ant 3.35    -46.5   7.2 1
1   Ant 3.19    -43 14.2    1
1   Ant 3.61    -41 24.55   1
1   Ant 4.24    -48 23.15   1
1   Post    2.09    -33 27.25   1
1   Post    2.83    -32 21.2    1
1   Post    3.26    -28 29.7    0
1   Post    3.34    -41.5   10.15   1
2   Ant 5.29    -30 10.9    1
2   Ant 4.22    -29 11.3    1
2   Ant 2.40    -15 10.2    1
2   Ant 3.32    -18.5   8.75    1
2   Post    1.85    -27 9.3 1
2   Post    4.31    -37 8.75    1
2   Post    1.82    -29.5   13.95   1
2   Post    3.98    -24.5   9.8 0
2   Ant 4.06    -39.5   21.45   1
2   Ant 2.90    -35.5   16.5    1
2   Ant 3.23    -35.5   18.2    1
2   Ant 4.24    -31 14.3    1
2   Post    2.45    -31 9.6 1
2   Post    3.51    -20 6   1
2   Post    4.27    -17.5   8.4 1
2   Post    2.67    -25.5   25  0
3   Ant 3.065092996 -40 10.7    1
3   Ant 3.74    -38 17.8    .
3   Ant 3.61    -27 10.1    0
3   Ant 2.08    -26.5   6.45    .
3   Post    2.12    -35 20.4    1
3   Post    3.244   -39 27.5    1
3   Post    4.02    -42 19.9    1
3   Post    1.94    -19 16.6    1
3   Ant 4.37    -14 4.2 0
3   Ant 4.68    -33 6.9 0
3   Ant 3.35    -30.5   8.65    1
3   Ant 1.72    -33 14.1    1
3   Post    0.81    -27 12.2    1
3   Post    3.90    -26 18.35   1
3   Post    4.19    -29 9.4 1
3   Post    4.46    -19 10.2    1
4   Ant 2.89    -42 11.3    1
4   Ant 2.20    -28 12.45   1
4   Ant 2.97    -31 19.5    1
4   Ant 2.06    -31 22.3    1
4   Post    3.35    -44.5   32.9    1
4   Post    2.10    -35 15.3    1
4   Post    3.42    -35 8.35    1
4   Post    4.16    -33 20.9    1
4   Ant 4.06    -15.5   6   1
4   Ant 5.00    -25 21.5    1
4   Ant 4.13    -33.5   24.25   1
4   Ant 5.56    -34 16.7    1
4   Post    4.14    -35 31.75   1
4   Post    4.49    -33.5   25.4    1
4   Post    4.17    -29 41.9    1
4   Post    3.85    -28 28.8    1
5   Ant 2.67    -23 28.2    0
5   Ant 1.68    -23 10.3    1
5   Ant 2.07    -19.5   9.85    1
5   Ant 1.06    -25 12.7    1
5   Post    5.02    -31 10.4    0
5   Post    2.53    -23 11.7    1
5   Post    3.40    -71.5   27.15   1
5   Post    4.78    -41.5   31.85   1
5   Ant 2.15    -42 15.95   1
5   Ant 2.89    -26.5   16.9    1
5   Ant 2.25    -33 9.8 1
5   Ant 2.76    -28 12.1    0
5   Post    3.04    -22 9   1
5   Post    4.45    -37.5   9.25    1
5   Post    4.10    -20.5   15.3    1
5   Post    4.41    -31.5   18.05   1
6   Ant 1.61    -26 18.7    1
6   Ant 1.68    -26 7.4 0
6   Ant 3.93    -29 7.4 0
6   Ant 4.45    -21.5   9.15    .
6   Post    5.48    -28 25.05   1
6   Post    4.11    -48 30.7    1
6   Post    3.20    -23 22.3    1
6   Post    2.77    -28 15.3    1
6   Ant 2.34    -22 10.95   1
6   Ant 2.25    -30 6.2 1
6   Ant 4.16    -24.5   6.6  
6   Ant 4.62    -33 12  0
6   Post    2.32    -31 16.65   1
6   Post    4.03    -31 19  1
6   Post    3.43    -17 12  .
6   Post    3.51    -14 11.1    0
7   Ant 2.99    -30 7.65    1
7   Ant 1.80517419  -25 15.95   1
7   Ant 2.106053494 -32 13.8    1
7   Ant 3.096114016 -29 17.55   1
7   Post    3.167542074 -24 11.45   1
7   Post    3.338268984 -24 22.7    1
7   Post    2.659685183 -32.5   21.95   1
7   Post    3.751749917 -17 8.25    1
7   Ant 2.197529839 -25 8.8 0
7   Ant 3.664137015 -35 10.1    0
7   Ant 3.545702335 -39 9.4 0
7   Ant 2.023625001 -35 10.1    1
7   Post    2.372086883 -33 24.25   1
7   Post    3.582104056 -37.5   17.3    1
7   Post    3.45055168  -38 12.6    1
7   Post    3.841677068 -23 9.8 1
8   Ant 3.432008272 -25 18.6    0
8   Ant 2.136605495 -32.5   14.1    1
8   Ant 1.841190586 -18 7   0
8   Ant 2.15865836  -25.5   6.8 0
8   Post    3.359805409 -30 15.5    1
8   Post    3.631259765 -31 28  1
8   Post    4.674356585 -32.5   20.9    1
8   Post    4.044977037 -25 12.9    0
8   Ant 3.346860731 -28 14.6    1
8   Ant 3.850582629 -46 24.5    1
8   Ant 5.340021635 -31.5   8.5 1
8   Ant 3.980653721 -26 11.6    1
8   Post    3.704121331 -26 14.8    1
8   Post    3.848852913 -31.5   15.8    1
8   Post    4.939191479 -18 9.8 1
8   Post    3.134066196 -15 9.1 1
9   Ant 1.309024248 -25 9   1
9   Ant 3.369404446 -27 11.55   1
9   Ant 1.841284373 -26 13.15   1
9   Ant 3.675231524 -21 12.4    1
9   Post    3.06826061  -22 24.2    1
9   Post    3.59966626  -19 17.3    1
9   Post    4.466268907 -16.5   15.7    1
9   Post    1.882204503 -27 11.3    .
9   Ant 3.89896461  -25 21.1    1
9   Ant 2.295202494 -26 9   1
9   Ant 3.272687389 -24.5   4.95    0
9   Ant 4.201883396 -13 2.7 0
9   Post    4.374845784 -25 10.8    0
9   Post    4.459715564 -15.5   11.8    1
9   Post    3.227763303 -19.5   21.05   0
9   Post    2.517233031 -20.5   31.6    1
;

 

Thanks,
Tim

Accepted Solutions
StatDave
SAS Super FREQ

This is easily done in three steps. First, use PROC GEE to fit the desired repeated measures logistic GEE model and save the predicted event probabilities. Next, use PROC LOGISTIC with those predicted probabilities to generate and save the data for the ROC curve, as described in this note. Finally, use the ROCPLOT macro to compute the Youden index and find the optimal point based on that index. If desired, the macro can also find optimal points based on other criteria, as described in the macro.

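/* Fit the repeated measures logistic GEE model; save predicted probabilities in P */
proc gee data=have;
  class id position;
  model y(event="1")=position resistance force / dist=bin;
  repeated subject=id;
  output out=out p=p;
run;
/* Skip model fitting (NOFIT) and build the ROC curve data from the saved probabilities */
proc logistic data=out;
  model y(event="1")= / nofit outroc=roc;
  roc pred=p;
run;
/* Find and label the optimal cutoff using the Youden criterion */
%rocplot(v,inpred=out, inroc=roc, p=p, id=_opty_, optcrit=youden)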
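
As a rough cross-check on what the Youden criterion selects, the index can also be computed by hand from the ROC data. The Youden index at a cutoff is J = sensitivity + specificity - 1, and the optimal cutoff is the one with the largest J. The sketch below is not part of the ROCPLOT macro. It assumes the ROC data set created above contains the usual OUTROC= variables _PROB_, _SENSIT_ and _1MSPEC_; the data set name YOUDEN and the variable name J are made up for illustration.

```sas
/* Sketch only: assumes ROC holds the standard OUTROC= variables
   _PROB_ (cutoff), _SENSIT_ (sensitivity) and _1MSPEC_ (1 - specificity). */
data youden;                      /* hypothetical output data set name */
  set roc;
  J = _sensit_ - _1mspec_;        /* sensitivity + specificity - 1 */
run;

/* Put the cutoff with the largest J first */
proc sort data=youden;
  by descending J;
run;

/* The first row holds the predicted-probability cutoff that maximizes J */
proc print data=youden(obs=1);
  var _prob_ _sensit_ _1mspec_ J;
run;
```

If several cutoffs tie for the largest J, this prints only one of them, so it is worth inspecting the top few rows as well.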

StatDave
SAS Super FREQ

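BTW, if you have SAS Viya release 2022.10 (or later), you can skip the ROCPLOT macro and find the optimal cutoff directly in PROC LOGISTIC, using the new suboptions of the ROCOPTIONS option. The input is still the OUT data set of predicted probabilities saved by PROC GEE: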

proc logistic data=out
  rocoptions(optimal=youden method=lower id=optstat);
model y(event="1")= / nofit outroc=roc;
roc pred=p;
run;

For all of the available options, such as other optimality criteria and label thinning options, see the PROC LOGISTIC documentation in that release. 

timothy19
Fluorite | Level 6
Thanks, this solved my problem!
