Compare ROC curves ignoring missing predictors

Trusted Advisor
Posts: 1,189

I'm trying to compare AUC for two ROC curves.  But I have missing data for one of the predictors, and I want to ignore the missing values (instead of throwing out those records).

 

I know that if I put the predictors in the model, LOGISTIC will exclude the records with missing values.  So I thought the PRED= specification on the ROC statement might be my answer, but unfortunately it throws an error when it encounters a missing value:

 

data have;
  input x1 x2 y;
  cards;
1 1 0
2 2 1
3 . 0
4 2 0
5 1 1
;
run;

proc logistic data=have plots(only)=roc;
  model Y(event='1') = ;
  roc 'x1' pred=x1; 
  roc 'x2' pred=x2; *Throws error improper missing;
run;

 

Is there an easy way to get SAS to compare these two curves?  (Other than running two PROCs and saving the output data etc).

 

I had thought transforming the data might help:

data have;
  input group x y;
  cards;
1 1 0
1 2 1
1 3 0
1 4 0
1 5 1
2 1 0
2 2 1
2 2 0
2 1 1
;
run;

 

That would make it easy to get two ROC curves with a BY-statement, but I still can't see a way to get one chart with both curves, and an AUC comparison.
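
A minimal sketch of the BY-statement idea, assuming the stacked HAVE data set above is already sorted by GROUP (group 1 holds the x1 values, group 2 the non-missing x2 values). The output data set name roc_by is only illustrative. This gives one ROC curve per group, but not an overlay or an AUC comparison:

```sas
/* One ROC analysis per group of the stacked data.           */
/* Assumes HAVE (group, x, y) is already sorted by GROUP.    */
proc logistic data=have plots(only)=roc;
  by group;
  model y(event='1') = x / outroc=roc_by;  /* roc_by: illustrative name */
run;
```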

 

I realize simply ignoring missing values is not always the best approach, but I'm curious whether there is a way to do so here.

 

If not, I suppose I can run PROC LOGISTIC with a BY statement, output the statistics and other results, then plot the curves myself.

 

Thanks.


All Replies
SAS Employee
Posts: 16

Re: Compare ROC curves ignoring missing predictors

If any observation has a missing value (appearing as . when printed) in the X1 or X2 variable, PROC LOGISTIC immediately halts and issues the message that you got.  In this case, adding a WHERE statement to filter out observations with missing values should allow the procedure to run. For example:

 

proc logistic data=have plots(only)=roc;
  where x2 ~= .;  * drop records with missing x2 before the ROC analysis;
  model Y(event='1') = ;
  roc 'x1' pred=x1;
  roc 'x2' pred=x2;
run;

Trusted Advisor
Posts: 1,189

Re: Compare ROC curves ignoring missing predictors

Thanks @cici0017, but my hope was to include all 5 records when generating the ROC curve for X1, and include 4 records when generating the ROC curve for x2.   

 

So if it were a t-test, I want to do a two-sample t-test, not a paired t-test.  I suppose I want a two-sample comparison of the two ROC curves.

Respected Advisor
Posts: 3,773

Re: Compare ROC curves ignoring missing predictors

I don't know anything about this, but 2-sample implies to me that CLASS might be useful.

SAS Employee
Posts: 16

Re: Compare ROC curves ignoring missing predictors

Do you want to fit two models to the same data set with different predictors and get a comparative ROC graph? You need to use the NOFIT option and list all the variables on the MODEL statement. For example:

 

proc logistic data=have plots(only)=roc rocoptions(id=prob);
  model Y(event='1') = x1 x2 / nofit outroc=roc;
  roc 'x1' x1;
  roc 'x2' x2;
run;

proc print data=roc;
run;

 

The ROC statements automatically generate overlaid ROC curves for you.

 

[Image: ROCOverlay22.png]

 
Trusted Advisor
Posts: 1,189

Re: Compare ROC curves ignoring missing predictors

Yes @cici0017, that is the sort of chart I want.  But note that for one record the value of X2 is missing.

 

The logistic output notes this:

 

Number of Observations Read 5
Number of Observations Used 4

 

As I understand it, that means only 4 obs were used for both the ROC curve of X1 and the ROC curve of X2.

 

My goal was to make the same plot you made (and ideally get a test on the difference in AUC), but have the ROC curve of X1 use all 5 obs and the ROC curve of X2 use the 4 obs that have data.

Solution
‎03-29-2016 02:20 PM
SAS Employee
Posts: 16

Re: Compare ROC curves ignoring missing predictors

See Usage Note 45339: Comparing the areas under independent ROC curves

http://support.sas.com/kb/45/339.html

Trusted Advisor
Posts: 1,189

Re: Compare ROC curves ignoring missing predictors

Thanks much @cici0017.  That note is very helpful, and confirms that in order to compare two independent ROC curves I need to run PROC LOGISTIC twice, save the output data from each, and then overlay the charts myself (and compute the test statistic to compare them).  Bummer, but not the end of the world.  I guess it's the price I pay for missing data.  : )
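
For anyone landing here later, a rough sketch of that two-run approach, using the wide HAVE data set (x1, x2, y) from the first post. It is not code copied from the usage note. It treats the two AUC estimates as independent and compares them with a 1-df chi-square statistic, (AUC1 - AUC2)**2 / (SE1**2 + SE2**2). The data set names (roc1, roc2, auc1, auc2, auc_test, roc_both) are made up for illustration. The sketch assumes the ROCAssociation ODS table carries the columns ROCModel, Area and StdErr, and that OUTROC= carries _SENSIT_ and _1MSPEC_, so check those against your release:

```sas
/* Run 1: x1 is complete, so all 5 records are used. */
proc logistic data=have;
  model y(event='1') = x1 / nofit outroc=roc1;
  roc 'x1' x1;
  ods output ROCAssociation=auc1;   /* assumed columns: ROCModel Area StdErr */
run;

/* Run 2: only x2 is in the model, so only the record missing x2 is dropped. */
proc logistic data=have;
  model y(event='1') = x2 / nofit outroc=roc2;
  roc 'x2' x2;
  ods output ROCAssociation=auc2;
run;

/* Two-sample test, treating the two AUC estimates as independent. */
data auc_test;
  set auc1(where=(ROCModel='x1'));
  auc_x1 = Area;  se_x1 = StdErr;
  set auc2(where=(ROCModel='x2'));
  auc_x2 = Area;  se_x2 = StdErr;
  chisq   = (auc_x1 - auc_x2)**2 / (se_x1**2 + se_x2**2);
  p_value = 1 - probchi(chisq, 1);
  keep auc_x1 se_x1 auc_x2 se_x2 chisq p_value;
run;

/* Stack the two OUTROC= data sets and overlay the curves. */
data roc_both;
  length curve $2;
  set roc1(in=from_x1) roc2;
  curve = ifc(from_x1, 'x1', 'x2');
run;

proc sgplot data=roc_both;
  series x=_1mspec_ y=_sensit_ / group=curve;
  lineparm x=0 y=0 slope=1 / lineattrs=(pattern=dash);  /* chance line */
  xaxis label='1 - Specificity';
  yaxis label='Sensitivity';
run;
```

Printing auc_test shows both areas, their standard errors, and the test result. With only 5 records the numbers are purely illustrative.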

☑ This topic is SOLVED.
