AVA_16
Obsidian | Level 7

Hello,

 

I am looking at the accuracy of a new biomarker, say X (continuous), in diagnosing diabetes (yes or no) using logistic regression. I am using age (continuous), sex (binary), treatment (binary), and BMI (continuous) as covariates in the model. I am using the highest Youden index to select a cutoff for this biomarker:

 

 

 

For example:

Obs   cutoff    prob       Sensitivity   Specificity   Youden
1     14.2085   0.097430   0.87550       0.88062       0.75612

      

 

Is there any way to estimate confidence intervals for the cutoff and the highest Youden index? Any help is appreciated.

ballardw
Super User

At a minimum, it would help to share the code you are using to get the index values.

AVA_16
Obsidian | Level 7

Thank you for your reply. Here is the SAS code I have used:

 

************ROC curve, fully adjusted model for age, sex, and BMI;
*obtain intercept and slope using the step below;

ods graphics on;
proc logistic data = CAT ;
TITLE 'ROC curve of X';
model Diabetes_120_(event='1') = X age sex BMI / lackfit rsquare outroc=rocdata2;
output out=pred predicted=pred;
roc "X";
run;
ods graphics off;

*Calculate a rational cut-off point in ROC curve analyses;
*using logit=intercept+slope(X), where X is cutoff, so cutoff=(logit-intercept)/slope;
*Here intercept is -13.5972 and slope is 0.8003;

data CAT3(keep=cutoff prob Sensitivity Specificity Youden);
set rocdata2;
logit=log(_prob_/(1-_prob_)); *calculate logit;
cutoff=(logit+13.5972)/0.8003; *calculate cutoff as (logit - intercept)/slope;
prob= _prob_; *predicted probability at this ROC point;
Sensitivity = _SENSIT_; *calculate sensitivity;
Specificity = 1-_1MSPEC_; *calculate specificity;
Youden= _SENSIT_+ (1-_1MSPEC_)-1; *calculate Youden index;
run;

*sort data CAT3 by descending Youden index;
proc sort data=CAT3 ;
by descending Youden ;
run;

proc print data=CAT3 (firstobs= 1 obs= 10);
TITLE 'First ten values of Youden index';
run;

 

 

Rick_SAS
SAS Super FREQ

A review of the literature and a computational method are available in the section "Computing algorithm" of the following paper:

 

Lai CY, Tian L, Schisterman EF. Exact confidence interval estimation for the Youden index and its corresponding optimal cut-point. Comput Stat Data Anal. 2010;56(5):1103-1114.

AVA_16
Obsidian | Level 7

Thank you. Is there any way to do this in SAS? I was not able to go through the example because the link to it does not open.

StatDave
SAS Super FREQ

I don't think this corresponds to the paper that Rick referred to, but one way to get a confidence interval on the optimal cutpoint on the predictor is to fit the model in PROC PROBIT and use the INVERSECL option. In your example, the following gives a confidence interval around the optimal X cutoff when you replace "youden-prob-level" with the predicted probability associated with your Youden-optimal cutpoint. 

 

proc probit data = CAT inversecl(prob= youden-prob-level);
model Diabetes_120_(event='1') = X age sex BMI / d=logistic;
run;
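
To avoid retyping the probability by hand, one option is to read it from the ROC data into a macro variable and pass that to INVERSECL. This is only a sketch under assumptions: it relies on the CAT3 data set and its prob and Youden variables from the DATA step posted earlier in this thread, and the names CAT3_sorted and youden_prob are made up for illustration. If several ROC points tie for the maximum Youden index, only the first one after sorting is used.

```sas
/* Put the ROC point with the largest Youden index first (first row if tied) */
proc sort data=CAT3 out=CAT3_sorted;
  by descending Youden;
run;

/* Store its predicted probability in a macro variable.
   youden_prob is a hypothetical name chosen for this sketch. */
data _null_;
  set CAT3_sorted(obs=1);
  call symputx('youden_prob', prob);
run;

%put NOTE: Youden-optimal probability cutoff = &youden_prob;

/* Same model as in PROC LOGISTIC, with X listed first */
proc probit data=CAT inversecl(prob=&youden_prob);
  model Diabetes_120_(event='1') = X age sex BMI / d=logistic;
run;
```

Checking the value written by %PUT against the printed ROC table is a quick way to confirm that the intended probability was passed.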
AVA_16
Obsidian | Level 7

Hello,

 

Thank you. I do get a confidence interval for the cutoff using the SAS code above. However, there is a problem.

 

************ROC curve, fully adjusted model for age, sex, and BMI;
*obtain intercept and slope using the step below;

ods graphics on;
proc logistic data = CAT ;
model Diabetes_120_(event='1') = X age sex BMI / lackfit rsquare outroc=rocdata2;
output out=pred2 predicted=pred2;
roc "X";
run;
ods graphics off;

*Calculate a rational cut-off point in ROC curve analyses;
*using logit=intercept+slope(X), where X is cutoff, so cutoff=(logit-intercept)/slope;
*Here intercept is -13.5972 and slope is 0.8003;

data CAT3(keep=cutoff prob Sensitivity Specificity Youden);
set rocdata2;
logit=log(_prob_/(1-_prob_)); *calculate logit;
cutoff=(logit+13.5972)/0.8003; *calculate cutoff as (logit - intercept)/slope;
prob= _prob_; *predicted probability at this ROC point;
Sensitivity = _SENSIT_; *calculate sensitivity;
Specificity = 1-_1MSPEC_; *calculate specificity;
Youden= _SENSIT_+ (1-_1MSPEC_)-1; *calculate Youden index;
run;

*sort data CAT3 by descending Youden index;
proc sort data=CAT3 ;
by descending Youden ;
run;

proc print data=CAT3 (firstobs= 1 obs= 10);
TITLE 'First ten values of Youden index';
run;

 

I get this as output from the above code:

 

Obs   cutoff    prob       Sensitivity   Specificity   Youden
1     14.2085   0.097430   0.87550       0.88062       0.75612

 

When I use the command below:

 

*confidence intervals for youden index and cutoff;

proc probit data = CAT inversecl(prob= 0.097430);

model Diabetes_120_(event='1') = VAR8 age sex VAR5 / d=logistic;

run;

 

X          95% Fiducial Limits
11.5330    11.2543    11.7997

 

The cutoff I am getting here is different from the one I got above, i.e., 14.2085. Any idea why?

StatDave
SAS Super FREQ

That is probably because you have not fit the exact same model in PROBIT as you did in LOGISTIC. The variable names are not the same. Be sure to verify that the fitted parameters from PROBIT match those from LOGISTIC. Note also that the INVERSECL option gives confidence intervals for the first predictor listed in the MODEL statement. That should be your X variable. 
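
A minimal sketch of that check, assuming the data set and variable names used in this thread: capture the parameter estimates from both procedures with ODS OUTPUT and print them one after the other. The output data set names pe_logistic and pe_probit are hypothetical.

```sas
/* Capture the parameter estimates from PROC LOGISTIC */
ods output ParameterEstimates=pe_logistic;
proc logistic data=CAT;
  model Diabetes_120_(event='1') = X age sex BMI;
run;

/* Capture the parameter estimates from PROC PROBIT (logistic distribution) */
ods output ParameterEstimates=pe_probit;
proc probit data=CAT;
  model Diabetes_120_(event='1') = X age sex BMI / d=logistic;
run;

/* Print both tables so the estimates can be compared by eye */
title 'PROC LOGISTIC estimates';
proc print data=pe_logistic noobs;
run;

title 'PROC PROBIT estimates';
proc print data=pe_probit noobs;
run;
title;
```

The intercept and the coefficients for X, age, sex, and BMI should agree closely between the two tables. If a CLASS statement is added in only one procedure, or the two procedures code a CLASS variable differently, the estimates can differ even though the model is nominally the same.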

AVA_16
Obsidian | Level 7

Hello,

Thank you for your reply. Actually, both are the same variables. Is the difference in the cutoff there because the logistic regression model includes age, sex, and BMI in addition to X?

 

*confidence intervals for youden index and cutoff;

 

proc probit data = CAT inversecl(prob= 0.097430);

model Diabetes_120_(event='1') = X age sex VAR5 / d=logistic;

run;

 

X          95% Fiducial Limits
11.5330    11.2543    11.7997

 

 

However, when calculating the cutoff from the OUTROC data, I am using only the intercept and the slope for X and have not accounted for age, sex, and BMI. If so, how can I account for these variables while calculating the cutoff?

 

 

*Calculate a rational cut-off point in ROC curve analyses;
*using logit=intercept+slope(X), where X is cutoff, so cutoff=(logit-intercept)/slope;
*Here intercept is -13.5972 and slope is 0.8003;

data CAT3(keep=cutoff prob Sensitivity Specificity Youden);
set rocdata2;
logit=log(_prob_/(1-_prob_)); *calculate logit;
cutoff=(logit+13.5972)/0.8003; *calculate cutoff as (logit - intercept)/slope;
prob= _prob_; *predicted probability at this ROC point;
Sensitivity = _SENSIT_; *calculate sensitivity;
Specificity = 1-_1MSPEC_; *calculate specificity;
Youden= _SENSIT_+ (1-_1MSPEC_)-1; *calculate Youden index;
run;

 

StatDave
SAS Super FREQ

There is no need to compute the cutpoint on the predictor (X) scale as you do in your DATA CAT3 step. You just need the cutpoint on the probability scale (which is apparently 0.0974). Using that value, PROC PROBIT provides the cutpoint estimate on the X scale using the full model, along with a confidence interval. So, the estimate and confidence interval you got from PROBIT should be what you want.
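
For reference, the arithmetic behind the two different numbers can be sketched as follows. The intercept, slope, and probability are the values quoted in this thread. The covariate coefficients and covariate values are hypothetical placeholders, set to zero here, to be replaced by the fitted estimates and the covariate values of interest. With the covariate terms at zero the formula reduces to the DATA CAT3 calculation and should give a cutoff of about 14.2. Once nonzero covariate terms are subtracted, the X cutoff for the same probability shifts, which is why an inversion that uses only the intercept and the X slope need not agree with one based on the full model.

```sas
data cutoff_check;
  /* Values quoted in this thread */
  b0 = -13.5972;   /* intercept */
  bX = 0.8003;     /* slope for X */
  p  = 0.097430;   /* Youden-optimal predicted probability */

  /* HYPOTHETICAL placeholders: replace with your own fitted
     coefficients and the covariate values you want to condition on */
  b_age = 0;  age = 0;
  b_sex = 0;  sex = 0;
  b_bmi = 0;  bmi = 0;

  logit = log(p / (1 - p));

  /* Solve logit = b0 + bX*X + b_age*age + b_sex*sex + b_bmi*bmi for X */
  cutoff = (logit - b0 - b_age*age - b_sex*sex - b_bmi*bmi) / bX;

  /* Write the result to the log */
  put cutoff= 8.4;
run;
```

Which covariate values PROC PROBIT uses for the inverse limits is worth confirming in its documentation (see the XDATA= option); the sketch above only shows that the X cutoff depends on them.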

AVA_16
Obsidian | Level 7

Thank you. This sounds good. However, if I use the %rocplot macro to obtain a cutoff for the variable X at the highest Youden index, it comes out to be 12.1 instead of 11.5 (obtained from PROC PROBIT). The sensitivity, specificity, and the highest Youden index are the same using the %rocplot macro and PROC PROBIT. Only the cutoff for variable X is different. Any idea why?

 

The SAS code I used for the %rocplot macro is below:

 

ods graphics on;
proc logistic data = CAT ;
class sex (ref='0');
model Diabetes_120_(event='1') = X age sex BMI / lackfit rsquare outroc=rocdata2;
output out=pred2 p=pred2;
roc ;
run;
ods graphics off;

*I am only using variable X so as to get a clear picture on the ROC plot. However, the inroc data and predicted probabilities are from the adjusted model;

%inc "Path of the %rocplot macro";
%rocplot("9.4", inpred = pred2,inroc = rocdata2, p = pred2,
id = X _OPTY_ _sens_ _spec_ ,optcrit= youden, optsymbolstyle = size=13 color=red weight=bold)

StatDave
SAS Super FREQ

Assuming that the macro reported a unique optimum for the Youden criterion, then verify that the probability value associated with the unique maximum is the value you are specifying in INVERSECL(PROB=). In the Optimal Cutpoints table from the macro, that value is in the Cutpoint column. See Examples 1 and 2 in the ROCPLOT macro documentation.

AVA_16
Obsidian | Level 7

The probability value associated with the unique optimum for the Youden index is the same one I am using in INVERSECL(PROB=). Even so, the cutoff for the variable X comes out different.

StatDave
SAS Super FREQ

It's not clear what the issue is without seeing your data.

AVA_16
Obsidian | Level 7

Thank you for your help. There was some issue with the dataset. Your help is highly appreciated.
