Statistical Procedures

AVA_16 · Posted 03-06-2019 06:51 AM

I am testing the accuracy of a new biomarker using an ROC curve. I am able to obtain sensitivity and specificity at the highest Youden index. I would like to calculate confidence intervals for sensitivity and specificity.

I am using the following code:

************ROC curve, adjusted model;

*Obtain intercept and slope using the below command;

ods graphics on;

proc logistic data = CAT ;

model Diabetes_120_(event='1') = VAR8 age sex VAR5 / lackfit rsquare outroc=rocdata1; *VAR8, age, VAR5 are continuous variables and;

roc ;

run;

ods graphics off;

*Calculate a rational cut-off point in ROC curve analyses;

*using logit=intercept+slope(X), where X is the cutoff, so cutoff=(logit-intercept)/slope;

*Here intercept is -13.5972 and slope is 0.8003;

data CAT3(keep=cutoff prob Sensitivity Specificity

Youden);

set rocdata1;

logit=log(_prob_/(1-_prob_));*calculate logit;

cutoff=(logit+13.5972)/0.8003; *calculate cutoff;

prob= _prob_; *keep the predicted probability at this cutpoint;

Sensitivity = _SENSIT_; *calculate sensitivity;

Specificity = 1-_1MSPEC_; *calculate specificity;

Youden= _SENSIT_+ (1-_1MSPEC_)-1; *calculate Youden index;

run;

*sort data CAT3 by descending Youden index;

Proc sort data=CAT3 ;

by descending Youden ;

run;

Proc print data=CAT3 (firstobs= 1 obs= 10);

TITLE 'First ten values of Youden index';

Run;

Using this I get a cut-off of 14.2085, sensitivity 0.87550, Specificity 0.88064 at highest Youden index 0.7561.

I am using the following code to calculate exact confidence intervals for sensitivity and specificity. However, I am getting the wrong confidence intervals. I get correct CIs in the unadjusted model, where I use only VAR8, but not in the fully adjusted model.

********CI of sensitivity and specificity;

*Create a new variable Diabetes_60_. Give it a value 1 if 1hPG is >=14.2085 or otherwise 0;

data CAT4;

set CAT;

If VAR8>=14.2085 then Diabetes_60_=1;

else Diabetes_60_=0;

run;

*Create a new data set CAT5, keeping just three variables: id, Diabetes_120_ (defined by 2hPG) and Diabetes_60_ (defined by 1hPG);

data CAT5 (keep= id Diabetes_120_ Diabetes_60_);

*Keep ids, outcome By gold standard, outcome By new biomarker;

set CAT4;

run;

*sort data CAT5 by patient;

Proc sort data=CAT5 ;

by id;

run;

*Do proc freq;

Proc freq data=CAT5 ;

tables (Diabetes_120_)*Diabetes_60_;

run;

*calculate CI;

data CI2;

input Diabetes_120_ Diabetes_60_ Count;

datalines;

0 0 3051

0 1 40

1 0 164

1 1 85

;

*Count values of Diabetes_120_ and Diabetes_60_;

proc sort data=CI2;

by descending Diabetes_120_ descending Diabetes_60_;

run;

proc freq data=CI2 order=data;

weight Count;

tables Diabetes_120_*Diabetes_60_;

run;

title 'Sensitivity';

proc freq data=CI2;

where Diabetes_120_=1;

weight Count;

tables Diabetes_60_ / binomial(level="1");

exact binomial;

run;

title 'Specificity';

proc freq data=CI2;

where Diabetes_120_=0;

weight Count;

tables Diabetes_60_ / binomial(level="0");

exact binomial;

run;

CIs for sensitivity:

Exact Conf Limits
95% Lower Conf Limit	0.3017
95% Upper Conf Limit	0.4245

Specificity:

Exact Conf Limits
95% Lower Conf Limit	0.9817
95% Upper Conf Limit	0.9902

Any suggestions for SAS code to calculate bootstrap CIs or exact CIs would be appreciated.

StatDave · Posted 03-06-2019 09:36 AM

You're already getting the exact CIs in your output.
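As a quick check on this point, the reported limits can be compared with the 2x2 counts entered in the CI2 data step. The following is only a minimal sketch; the data set name `check` is illustrative and does not appear in the original post.

```sas
/* Point estimates implied by the counts typed into CI2 */
data check;
   sensitivity = 85 / (85 + 164);      /* reference positives flagged positive */
   specificity = 3051 / (3051 + 40);   /* reference negatives flagged negative */
   put sensitivity= 6.4 specificity= 6.4;
run;
```

These proportions are roughly 0.341 and 0.987, and both fall inside the exact limits PROC FREQ reported. The intervals are therefore consistent with the table that was supplied; the table itself comes from a threshold on VAR8, not from the adjusted model's predicted probabilities.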

AVA_16 · Posted 03-06-2019 09:41 AM

But these are not correct, because sensitivity is around 86% and specificity is 88% at the highest Youden index, using the cut-off mentioned below the logistic regression code.

StatDave · Posted 03-06-2019 09:54 AM

To correctly obtain the sensitivity and specificity for your cutpoint, save the predicted probabilities from your logistic model using the P= option in the OUTPUT statement. Then in a DATA step, use that cutpoint against the predicted probabilities to classify each observation as an event or nonevent. Then do your FREQ step in the same way as shown in this note to get exact CIs.
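The three steps described above might be sketched as follows. This is an illustration under stated assumptions, not code from the referenced note: the names `preds`, `phat`, `classified`, `pred_event`, and the macro variable `pcut` are hypothetical, and the value 0.5 is only a placeholder for the `_PROB_` value at the maximum Youden index in the OUTROC= data set.

```sas
/* Placeholder: replace with _PROB_ at the maximum Youden index */
%let pcut = 0.5;

/* Step 1: refit the adjusted model and save predicted probabilities */
proc logistic data=CAT;
   model Diabetes_120_(event='1') = VAR8 age sex VAR5;
   output out=preds p=phat;
run;

/* Step 2: classify on the probability scale, not on VAR8 */
data classified;
   set preds;
   if missing(phat) then delete;      /* rows with no prediction */
   pred_event = (phat >= &pcut);      /* 1 = predicted event, 0 = predicted nonevent */
run;

/* Step 3: exact binomial limits within each reference group */
title 'Sensitivity';
proc freq data=classified;
   where Diabetes_120_=1;
   tables pred_event / binomial(level="1");
   exact binomial;
run;

title 'Specificity';
proc freq data=classified;
   where Diabetes_120_=0;
   tables pred_event / binomial(level="0");
   exact binomial;
run;
title;
```

Because the model has several predictors, a single VAR8 value no longer corresponds to one point on the ROC curve, which is why the classification here uses the predicted probability.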

AVA_16 · Posted 03-07-2019 07:21 AM

Thank you for your reply. I get the predicted probabilities as you suggested. However, I am not sure how to use the cutpoint against the predicted probabilities to classify each observation as an event or nonevent. Can you please help?

StatDave · Posted 03-07-2019 11:11 AM

If an observation's predicted probability exceeds the cutoff, it's a predicted event.
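One way to apply this rule without retyping a rounded cutoff is to carry the `_PROB_` value straight from the OUTROC= data set into the classification step. The sketch below assumes that `rocdata1` is the OUTROC= data set from the first post and that `preds` with variable `phat` was created by an OUTPUT statement with P=phat; the names `youden`, `best`, `pcut`, `classified`, and `pred_event` are hypothetical. It uses `>=` so that an observation sitting exactly at the cutpoint counts as a predicted event, which is an assumption about how ties should be handled.

```sas
/* Youden index for every cutpoint on the ROC curve */
data youden;
   set rocdata1;
   Youden = _SENSIT_ - _1MSPEC_;     /* sensitivity + specificity - 1 */
run;

proc sort data=youden;
   by descending Youden;
run;

/* Keep only the probability cutoff with the largest Youden index */
data best;
   set youden(obs=1 keep=_PROB_ rename=(_PROB_=pcut));
run;

/* Attach the cutoff to every observation and classify */
data classified;
   if _n_ = 1 then set best;         /* pcut is retained on all rows */
   set preds;
   if missing(phat) then delete;     /* rows with no prediction */
   pred_event = (phat >= pcut);      /* 1 = predicted event, 0 = predicted nonevent */
run;
```

Cross-tabulating `Diabetes_120_` by `pred_event` in `classified` should then give proportions close to the sensitivity and specificity listed for that row of the OUTROC= data set. If they differ noticeably, the cutoff or the comparison operator is the first thing to check.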

AVA_16 · Posted 03-08-2019 02:22 AM

Thank you Dave. It was helpful.

AVA_16 · Posted 03-08-2019 03:13 AM

I am still not getting the right confidence intervals:

Sensitivity is 0.87550

Exact Conf Limits
95% Lower Conf Limit	0.7138
95% Upper Conf Limit	0.8218

Specificity is 0.88062

Exact Conf Limits
95% Lower Conf Limit	0.9062
95% Upper Conf Limit	0.9260
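For the bootstrap intervals requested at the start of the thread, a percentile bootstrap could be sketched as below. This is a rough illustration only. It assumes a data set `classified` with one row per subject, the reference outcome `Diabetes_120_`, and a 0/1 variable `pred_event` from the model-based classification; those names, the seed, and the 1000 replicates are all arbitrary choices. It also holds the fitted model and the cutoff fixed, so it does not capture the extra variability from re-estimating the model or re-selecting the cutoff in each resample.

```sas
/* Draw 1000 resamples of the same size, with replacement */
proc surveyselect data=classified out=boot seed=20190308
                  method=urs samprate=1 reps=1000 outhits;
run;

/* Proportion classified positive, per replicate and reference group */
proc means data=boot noprint nway;
   class Replicate Diabetes_120_;
   var pred_event;
   output out=bootstats mean=p_pos;
run;

/* Turn the proportions into sensitivity and specificity */
data bootstats;
   set bootstats;
   length measure $11;
   if Diabetes_120_ = 1 then do; measure = 'Sensitivity'; estimate = p_pos; end;
   else do; measure = 'Specificity'; estimate = 1 - p_pos; end;
run;

proc sort data=bootstats;
   by measure;
run;

/* 2.5th and 97.5th percentiles give the percentile bootstrap limits */
proc univariate data=bootstats noprint;
   by measure;
   var estimate;
   output out=bootci pctlpts=2.5 97.5 pctlpre=ci_;
run;

proc print data=bootci;
run;
```

The limits appear in `bootci` as `ci_2_5` and `ci_97_5`, one row for each measure.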

How to estimate 95% CI of sensitivity and specificity using logistic regression
