I am testing the accuracy of a new biomarker using an ROC curve. I am able to obtain the sensitivity and specificity at the highest Youden index. I would like to calculate confidence intervals for the sensitivity and specificity.
I am using the following code:
************ROC curve, adjusted model;
*Obtain intercept and slope using the below command;
ods graphics on;
proc logistic data = CAT ;
model Diabetes_120_(event='1') = VAR8 age sex VAR5 / lackfit rsquare outroc=rocdata1; *VAR8, age, VAR5 are continuous variables and;
roc ;
run;
ods graphics off;
*Calculate a rational cut-off point in ROC curve analyses;
*using logit=intercept+slope(X), where X is cutoff, so cutoff=(logit-intercept)/slope;
*Here intercept is -13.5972 and slope is 0.8003;
data CAT3(keep=cutoff prob Sensitivity Specificity
Youden);
set rocdata1;
logit=log(_prob_/(1-_prob_));*calculate logit;
cutoff=(logit+13.5972)/0.8003; *calculate cutoff;
prob= _prob_; *keep the predicted probability cutpoint;
Sensitivity = _SENSIT_; *calculate sensitivity;
Specificity = 1-_1MSPEC_; *calculate specificity;
Youden= _SENSIT_+ (1-_1MSPEC_)-1; *calculate Youden index;
run;
*sort data CAT3 by descending Youden index;
Proc sort data=CAT3 ;
by descending Youden ;
run;
Proc print data=CAT3 (firstobs= 1 obs= 10);
TITLE 'First ten values of Youden index';
Run;
Using this I get a cut-off of 14.2085, sensitivity 0.87550, Specificity 0.88064 at highest Youden index 0.7561.
I am using the following code to calculate exact confidence intervals for sensitivity and specificity. However, I am getting the wrong confidence intervals. I get correct CIs in the unadjusted model, where I use only VAR8, but not in the fully adjusted model.
********CI of sensitivity and specificity;
*Create a new variable Diabetes_60_. Give it a value 1 if 1hPG is >=14.2085 or otherwise 0;
data CAT4;
set CAT;
If VAR8>=14.2085 then Diabetes_60_=1;
else Diabetes_60_=0;
run;
*create a new data set CAT5, keeping just three variables: id, Diabetes_120_ defined by 2hPG and Diabetes_60_ defined by 1hPG;
data CAT5 (keep= id Diabetes_120_ Diabetes_60_);
*Keep ids, outcome By gold standard, outcome By new biomarker;
set CAT4;
run;
*sort data CAT5 by patient;
Proc sort data=CAT5 ;
by id;
run;
*Do proc freq;
Proc freq data=CAT5 ;
tables (Diabetes_120_)*Diabetes_60_;
run;
*calculate CI;
data CI2;
input Diabetes_120_ Diabetes_60_ Count;
datalines;
0 0 3051
0 1 40
1 0 164
1 1 85
;
*Count values of Diabetes_120_ and Diabetes_60_;
proc sort data=CI2;
by descending Diabetes_120_ descending Diabetes_60_;
run;
proc freq data=CI2 order=data;
weight Count;
tables Diabetes_120_*Diabetes_60_;
run;
title 'Sensitivity';
proc freq data=CI2;
where Diabetes_120_=1;
weight Count;
tables Diabetes_60_ / binomial(level="1");
exact binomial;
run;
title 'Specificity';
proc freq data=CI2;
where Diabetes_120_=0;
weight Count;
tables Diabetes_60_ / binomial(level="0");
exact binomial;
run;
CIs for sensitivity:
Exact Conf Limits
95% Lower Conf Limit: 0.3017
95% Upper Conf Limit: 0.4245
Specificity:
Exact Conf Limits
95% Lower Conf Limit: 0.9817
95% Upper Conf Limit: 0.9902
Any suggestion for sas code to calculate bootstrap CIs or exact CIs is appreciated.
You're already getting the exact CIs in your output.
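As a quick check, the limits printed above can be reproduced from the 2x2 counts in the CI2 data lines. The sketch below is an illustration only, not part of the original code: it computes the point estimates and the Clopper-Pearson (exact) limits from beta quantiles with the QUANTILE function. The data set name CHECK and its variable names are made up for this example.

```sas
* Hypothetical check: exact (Clopper-Pearson) limits from the CI2 counts;
data check;
  length measure $11;
  alpha = 0.05;
  * sensitivity: test-positive among Diabetes_120_=1 (85 of 85+164);
  measure = 'Sensitivity'; x = 85; n = 85 + 164; output;
  * specificity: test-negative among Diabetes_120_=0 (3051 of 3051+40);
  measure = 'Specificity'; x = 3051; n = 3051 + 40; output;
run;

data check;
  set check;
  estimate = x / n;
  * the lower limit is 0 when x=0 and the upper limit is 1 when x=n;
  if x > 0 then lower = quantile('BETA', alpha/2, x, n - x + 1);
  else lower = 0;
  if x < n then upper = quantile('BETA', 1 - alpha/2, x + 1, n - x);
  else upper = 1;
run;

proc print data=check noobs;
  var measure x n estimate lower upper;
  format estimate lower upper 6.4;
run;
```

The point estimates here are 85/249, about 0.341, and 3051/3091, about 0.987, and both fall inside the limits PROC FREQ reported. So the intervals are right for this table; it is the table itself that does not correspond to the sensitivity and specificity at the highest Youden index of the adjusted model.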
But these are not correct, because sensitivity is around 86% and specificity is 88% at the highest Youden index, using the cut-off mentioned below the logistic regression code.
To correctly obtain the sensitivity and specificity for your cutpoint, save the predicted probabilities from your logistic model using the P= option in the OUTPUT statement. Then in a DATA step, use that cutpoint against the predicted probabilities to classify each observation as an event or nonevent. Then do your FREQ step in the same way as shown in this note to get exact CIs.
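Those steps might look like the sketch below. It reuses the model and data set names from the question; the output data set PRED, the variable PHAT, the macro variable CUT and the data set CLASSIFIED are names invented for this example. The value assigned to CUT is only a placeholder for the _PROB_ value on the highest-Youden row of the OUTROC= data set.

```sas
* Placeholder only: replace with _PROB_ from the highest-Youden row of rocdata1;
%let cut = 0.5;

* Step 1: save predicted probabilities from the adjusted model;
proc logistic data=CAT;
  model Diabetes_120_(event='1') = VAR8 age sex VAR5;
  output out=pred p=phat;
run;

* Step 2: classify each observation against the probability cutpoint;
data classified;
  set pred;
  * rows with missing predictors have no predicted probability;
  if missing(phat) then delete;
  * 1 = predicted event, 0 = predicted nonevent;
  predicted = (phat >= &cut);
run;

* Step 3: exact CIs among true events and true nonevents;
title 'Sensitivity';
proc freq data=classified;
  where Diabetes_120_=1;
  tables predicted / binomial(level="1");
  exact binomial;
run;
title 'Specificity';
proc freq data=classified;
  where Diabetes_120_=0;
  tables predicted / binomial(level="0");
  exact binomial;
run;
title;
```

Note that the cutpoint here is on the predicted-probability scale, not a VAR8 value. The comparison uses >= on the assumption that this matches how PROC LOGISTIC applies the OUTROC= cutpoints; it is worth confirming against the documentation for your release.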
Thank you for your reply. I get the predicted probabilities as you suggested. However, I am not sure how to use the cutpoint against the predicted probabilities to classify each observation as an event or nonevent. Can you please help?
If an observation's predicted probability exceeds the cutoff, it's a predicted event.
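One way to apply that rule without retyping a rounded cutpoint is to read it straight from the OUTROC= data set. The sketch below assumes rocdata1 from the question still exists and that the predicted probabilities were saved (for example with an OUTPUT statement and P=) to a data set called PRED in a variable called PHAT. PRED, PHAT, ROCYOUDEN, CUTPOINT and CLASSIFIED are all hypothetical names.

```sas
* Find the OUTROC row with the highest Youden index;
data rocyouden;
  set rocdata1;
  youden = _SENSIT_ + (1 - _1MSPEC_) - 1;
run;
proc sort data=rocyouden;
  by descending youden;
run;

* Carry that row's probability cutpoint onto every observation and classify;
data classified;
  * variables read by SET are retained, so cutpoint stays on all rows;
  if _n_ = 1 then set rocyouden(obs=1 keep=_PROB_ rename=(_PROB_=cutpoint));
  set pred;
  if missing(phat) then delete;
  * 1 = predicted event, 0 = predicted nonevent;
  predicted = (phat >= cutpoint);
run;

* Cross-tabulate the true outcome against the predicted class;
proc freq data=classified;
  tables Diabetes_120_*predicted;
run;
```

If the classification matches the chosen OUTROC row, the row percentages in this table should reproduce that row's sensitivity and specificity. If they do not, the cutpoint or the direction of the comparison is the first thing to re-check before computing any confidence limits.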
Thank you Dave. It was helpful.
I am still not getting the right confidence intervals:
Sensitivity is 0.87550
Exact Conf Limits
95% Lower Conf Limit: 0.7138
95% Upper Conf Limit: 0.8218
Specificity is 0.88062
Exact Conf Limits
95% Lower Conf Limit: 0.9062
95% Upper Conf Limit: 0.9260