antor82
Obsidian | Level 7

Hi everybody

 

Can anyone help me bootstrap a cut-off value obtained with the Youden index method?

 

I can bootstrap the AUC with a 95% CI, but I cannot do the same for the cut-off value (or for sensitivity and specificity).

 

Thanks in advance

 

a

 

 

Ksharp
Super User

I don't know whether @Rick_SAS is interested in this.

Rick_SAS
SAS Super FREQ

Show what you've done so far.

antor82
Obsidian | Level 7

Hi Rick_SAS

I've resampled my dataset with PROC SURVEYSELECT,

then

 

%ODSOff
proc logistic data=BootOut;
	by Replicate;
	model Var1(event='1')=predictor;
	roc 'predictor' predictor;
	roccontrast;
	ods output Rocassociation=BootrocLVEF;
run;
%ODSOn

/*keep only the bootstrapped 'predictor' rows and exclude the 'Model' rows from the ODS output*/
data BootrocLVEF;
set BootrocLVEF;
where  ROCModel='predictor';
run;

proc univariate data=BootrocLVEF noprint;
   var Area;
   output out=WidePctls pctlpre=P_ pctlpts=2.5 97.5 mean=Mean Std=Std; 
run; 

proc print data=WidePctls noobs label;
   format Mean Std P_2_5 P_97_5 6.4;
   label Mean="BootMean" Std="BootStdErr" P_2_5="95% Lower CL" P_97_5="95% Upper CL";
run;

 

So far, I have bootstrapped the AUC with its 95% CI.
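The resampling call and the %ODSOff/%ODSOn macros are not shown in the code above. A minimal sketch of what they might look like follows. It assumes the input is the lucia.lucia data set used further down, and that OUTHITS is requested so that each selected row is its own observation (which the BY-group PROC LOGISTIC call without a FREQ statement implies). The seed and the number of replicates are arbitrary placeholders, and the macro bodies are a common pattern for suppressing output during BY-group analyses, not necessarily the definitions actually used.

```sas
/* Assumed helper macros: switch ODS display off/on around the BY-group analysis */
%macro ODSOff;
   ods graphics off;
   ods exclude all;
   ods noresults;
%mend;

%macro ODSOn;
   ods graphics on;
   ods exclude none;
   ods results;
%mend;

/* Draw bootstrap samples: resample with replacement, same size as the data */
proc surveyselect data=lucia.lucia noprint
      seed=12345          /* placeholder seed for reproducibility */
      out=BootOut
      method=urs          /* unrestricted random sampling = with replacement */
      samprate=1          /* each replicate has as many rows as the input */
      outhits             /* one output row per selection */
      reps=1000;          /* placeholder number of bootstrap replicates */
run;
```

The output data set BootOut contains a Replicate variable, which is what the BY statement in PROC LOGISTIC relies on.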

 

I also would like to find a cut-off value for this predictor.

 

In the original analysis, I ran

 

proc logistic data=lucia.lucia plots=none;
	model Var1(event='1')=predictor /OUTROC=rocdataLVEF;
	roc 'LVEF' predictor;
	roccontrast;
run;

/*the intercept and the predictor coefficient are read from the output above and entered by hand below*/
data rocdata2LVEF(keep=cutoff prob Sensitivity Specificity Youden);
	set rocdataLVEF;
	logit=log(_prob_/(1-_prob_));
	cutoff=(logit+14.0995)/0.2727;
	prob=_prob_;
	Sensitivity=_SENSIT_;
	Specificity=1-_1MSPEC_;
	Youden=_SENSIT_+ (1-_1MSPEC_)-1;
run;

proc sort data=rocdata2LVEF;
	by descending Youden;
run;

proc print data=rocdata2LVEF;
title 'cut off - LVEF';
run;

However, this is done on the original dataset, not on the bootstrap samples.

 

Thanks in advance

Rick_SAS
SAS Super FREQ

I see. Based on your program, it looks like you have already seen the bootstrap analysis presented in the article "Discrimination, accuracy, and stability in binary classifiers"?  Would it be possible to ask your question using the data in that article so that we all have access to the data?

antor82
Obsidian | Level 7

Hi @Rick_SAS 

 

Sorry for the delay in replying.

 

Briefly, referring to the article you suggested, I've used only the first dataset (roc) for simplicity.

 

So, this is what I've done so far:

 

/*estimating a cut-off value for alb*/
proc logistic data=roc plots=roc;
	model popind(event='0')=alb / outroc=roc_alb;
	roc 'alb' alb;
	output out=out_alb p=pred;
run;

/*Youden index*/
data roc_alb2(keep=cutoff prob Sensitivity Specificity Youden);
	set roc_alb;
	logit=log(_prob_/(1-_prob_));
	cutoff=(logit-2.4646)/-1.0520;
	prob=_prob_;
	Sensitivity=_SENSIT_;
	Specificity=1-_1MSPEC_;
	Youden=_SENSIT_+ (1-_1MSPEC_)-1;
run;

proc sort data=roc_alb2;
	by descending Youden;
run;

proc print data=roc_alb2;
	title 'cut off - alb';
run;

/*95%CI of cut-off*/
proc probit data=roc inversecl(prob=0.33370);
	model popind(event='0')=alb / d=logistic;
	title 'cut-off with 95% CI - alb';
run;

/*95%CI of Sensitivity and Specificity*/
data out_alb;
	set out_alb;
	if pred>0.33 then pop_alb=1;
	else pop_alb=0;
run;

title 'Sensitivity - pop_alb';
proc freq data=out_alb;
	where popind=0;
	tables pop_alb / binomial(level="1");
	exact binomial;
run;

title 'Specificity - pop_alb';
proc freq data=out_alb;
	where popind=1;
	tables pop_alb / binomial(level="0");
	exact binomial;
run;

/*likelihood ratio albumin*/
proc genmod data=out_alb descending;
	class popind pop_alb;
	model pop_alb=popind / dist=binomial link=identity noint;
	store genfit;
run;

data fd;
	length label f $32767;
	infile datalines delimiter=',';
	input label f;
	datalines;
      LR+, b_p1/b_p2
      LR-, (1-b_p1)/(1-b_p2)
      ;
    title 'Likelihood Ratio - pop_alb';
	%NLEST(instore=genfit, fdata=fd, df=10);

I've also attached the results I've found

 

So, finally, my question: how can I repeat these analyses on the bootstrap samples?

 

Thanks again
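One possible way to carry the Youden step over to bootstrap samples is sketched below for the roc data. It is not a confirmed solution from this thread. It assumes the variables popind and alb from the code above. All new data set names (BootRoc, BootROCpts, BootPE, BootCoef, BootYouden, BootBest, BootCI), the seed, and the number of replicates are placeholders. Instead of typing the intercept and slope by hand, each replicate uses its own estimates from the ParameterEstimates table to convert the probability cut-off back to the alb scale.

```sas
/* 1. Resample with replacement (seed and reps are placeholders) */
proc surveyselect data=roc noprint seed=12345 out=BootRoc
      method=urs samprate=1 outhits reps=1000;
run;

/* 2. Fit the model in each replicate; keep ROC points and coefficients */
ods exclude all;
proc logistic data=BootRoc plots=none;
   by Replicate;
   model popind(event='0') = alb / outroc=BootROCpts;
   ods output ParameterEstimates=BootPE;
run;
ods exclude none;

/* 3. One row per replicate with intercept (b0) and slope (b1) */
data BootCoef(keep=Replicate b0 b1);
   set BootPE;
   by Replicate;
   retain b0 b1;
   if upcase(Variable) = 'INTERCEPT' then b0 = Estimate;
   else b1 = Estimate;
   if last.Replicate then output;
run;

/* 4. Youden index at every ROC point, using each replicate's own coefficients */
data BootYouden;
   merge BootROCpts BootCoef;
   by Replicate;
   Sensitivity = _SENSIT_;
   Specificity = 1 - _1MSPEC_;
   Youden = Sensitivity + Specificity - 1;
   cutoff = (log(_PROB_ / (1 - _PROB_)) - b0) / b1;  /* back to the alb scale */
run;

/* 5. Keep the point with the largest Youden index in each replicate */
proc sort data=BootYouden;
   by Replicate descending Youden;
run;

data BootBest(keep=Replicate cutoff Sensitivity Specificity Youden);
   set BootYouden;
   by Replicate;
   if first.Replicate;
run;

/* 6. Percentile intervals across replicates */
proc univariate data=BootBest noprint;
   var cutoff Sensitivity Specificity;
   output out=BootCI mean=Cut_Mean Sens_Mean Spec_Mean
          pctlpre=Cut_ Sens_ Spec_ pctlpts=2.5 97.5;
run;

proc print data=BootCI noobs;
run;
```

Some caveats with this sketch: ties in the Youden index are broken arbitrarily by the sort, replicates in which the model does not converge are not screened out, and sensitivity and specificity evaluated at each replicate's own optimal cut-off are likely to be optimistic. An alternative would be to fix the cut-off estimated from the original data and bootstrap only sensitivity and specificity at that value.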
