Programming the statistical procedures from SAS

Classification table proc logistic

Occasional Contributor
Posts: 8

Classification table proc logistic

We could create a classification table in two ways:

1. Using proc logistic with ctable pprob=xxx

Example:

proc logistic desc data=mmse;
   model fn = lhippoc lmidtemp eicv c_age_a c_age_b ss / ctable pprob=0.32;
run;


2. Using the OUTPUT statement and then classifying in a data step:

proc logistic desc data=mmse;
   model fn = lhippoc lmidtemp eicv c_age_a c_age_b ss / ctable pprob=0.32;
   output out=ci_with_outl p=rsk;   /* rsk = predicted event probability */
run;

data ci_with_outl;
   set ci_with_outl;
   /* same cutpoint as pprob= above */
   if rsk >= 0.32 then pos=1; else pos=0;
run;

proc sort data=ci_with_outl;
   by descending pos descending fn;
run;

proc freq data=ci_with_outl order=data;
   table pos*fn;
run;


The problem is the following: I get two different classification tables, with different numbers of true/false positives and negatives.

The question is: what algorithm does proc logistic use to calculate the true/false positives and negatives for ctable?



All Replies
Grand Advisor
Posts: 16,908

Re: Classification table proc logistic

Pretty sure it's the same algorithm; check the documentation under classification table and details.

How close are they? One possibility is that they differ due to rounding. The second is that the modeled probability is the opposite of what you expect, i.e. if you modeled a binary response such as 0/1, the event is taken to be 0, not 1, unless specified otherwise.
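To rule out the event-level issue, the event category can be named explicitly instead of relying on the desc option. A minimal sketch, reusing the mmse data set and variables from the question, and assuming (this is not stated in the thread) that fn is coded 0/1 with 1 as the outcome of interest:

```sas
/* Assumption: fn is numeric 0/1 and 1 is the event of interest. */
proc logistic data=mmse;
   model fn(event='1') = lhippoc lmidtemp eicv c_age_a c_age_b ss
         / ctable pprob=0.32;
   output out=ci_with_outl p=rsk;   /* rsk = predicted P(fn=1) */
run;
```

The "Response Profile" table and the "Probability modeled is ..." note in the output show which level is actually being modeled.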


Occasional Contributor
Posts: 8

Re: Classification table proc logistic

I checked the pos variable in Excel. It was correct: pos=1 wherever the risk was 0.32 or higher.

Occasional Contributor
Posts: 8

Re: Classification table proc logistic

They are pretty close. Of 147 subjects (47 with the disease), 26 with the disease and 92 without it were correctly classified by proc logistic (true positives/true negatives), while 27 and 95 respectively were correctly classified with proc freq. That gives me a difference of 3% for sensitivity and for specificity.
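The arithmetic behind that comparison can be checked with a short data step. This is only a sketch using the counts quoted above; the variable names are made up for illustration. With these counts the gap works out to roughly 2 percentage points for sensitivity (26/47 vs 27/47) and 3 for specificity (92/100 vs 95/100).

```sas
data _null_;
   n_dis   = 47;          /* subjects with the disease    */
   n_nodis = 147 - 47;    /* subjects without the disease */

   sens_ctable = 26 / n_dis;    spec_ctable = 92 / n_nodis;   /* proc logistic ctable */
   sens_freq   = 27 / n_dis;    spec_freq   = 95 / n_nodis;   /* output + proc freq   */

   sens_diff = sens_freq - sens_ctable;
   spec_diff = spec_freq - spec_ctable;

   /* write the values to the log as percentages */
   put sens_ctable= percent8.1 sens_freq= percent8.1 sens_diff= percent8.1;
   put spec_ctable= percent8.1 spec_freq= percent8.1 spec_diff= percent8.1;
run;
```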

Grand Advisor
Posts: 16,908

Re: Classification table proc logistic

According to the doc >= (GE) is the correct comparison.

If the predicted event probability exceeds or equals some cutpoint value $z \in [0,1]$, the observation is predicted to be an event observation; otherwise, it is predicted as a nonevent. A $2\times 2$ frequency table can be obtained by cross-classifying the observed and predicted responses. The CTABLE option produces this table, and the PPROB= option selects one or more cutpoints. Each cutpoint generates a classification table. If the PEVENT= option is also specified, a classification table is produced for each combination of PEVENT= and PPROB= values.
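As an illustration of the "one or more cutpoints" part, here is a sketch that requests a classification table row for each cutpoint in a range. It reuses the model from the original question; the range 0.2 to 0.8 is an arbitrary choice for the example.

```sas
/* One classification table row per cutpoint: 0.2, 0.3, ..., 0.8 */
proc logistic desc data=mmse;
   model fn = lhippoc lmidtemp eicv c_age_a c_age_b ss
         / ctable pprob=(0.2 to 0.8 by 0.1);
run;
```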

I can't see that you're missing anything. In fact I can reproduce this with the sample data. I would expect this to work and it doesn't, but that doesn't mean I'm not missing something or doing something wrong. 

Consider opening a track with tech support. An example that replicates the issue is below:

data Remission;
   input remiss cell smear infil li blast temp;
   label remiss='Complete Remission';
   datalines;
1   .8   .83  .66  1.9  1.1     .996
1   .9   .36  .32  1.4   .74    .992
0   .8   .88  .7    .8   .176   .982
0  1     .87  .87   .7  1.053   .986
1   .9   .75  .68  1.3   .519   .98
0  1     .65  .65   .6   .519   .982
1   .95  .97  .92  1    1.23    .992
0   .95  .87  .83  1.9  1.354  1.02
0  1     .45  .45   .8   .322   .999
0   .95  .36  .34   .5  0      1.038
0   .85  .39  .33   .7   .279   .988
0   .7   .76  .53  1.2   .146   .982
0   .8   .46  .37   .4   .38   1.006
0   .2   .39  .08   .8   .114   .99
0  1     .9   .9   1.1  1.037   .99
1  1     .84  .84  1.9  2.064  1.02
0   .65  .42  .27   .5   .114  1.014
0  1     .75  .75  1    1.322  1.004
0   .5   .44  .22   .6   .114   .99
1  1     .63  .63  1.1  1.072   .986
0  1     .33  .33   .4   .176  1.01
0   .9   .93  .84   .6  1.591  1.02
1  1     .58  .58  1     .531  1.002
0   .95  .32  .3   1.6   .886   .988
1  1     .6   .6   1.7   .964   .99
1  1     .69  .69   .9   .398   .986
0  1     .73  .73   .7   .398   .986
;

proc logistic data=Remission outest=betas covout;
   model remiss(event='1')=cell smear infil li blast temp
                /ctable pprob=0.5 ;
   output out=pred p=phat lower=lcl upper=ucl
          predprob=(individual crossvalidate);
run;

data ctable;
    set pred;
    if phat>=0.5 then test=1;
    else test=0;
run;

proc freq data=ctable;
   table remiss*test;
run;

Occasional Contributor
Posts: 8

Re: Classification table proc logistic

It is the same thing: the two methods give different results.

With proc logistic:

Correctly classified
Event (test 1 remiss 1) = 4
Non-event (test 0 remiss 0) = 15

Incorrectly classified
Event (test 1 remiss 0) = 3
Non-event (test 0 remiss 1) = 5

With proc freq:

Correctly classified
Event (test 1 remiss 1) = 5
Non-event (test 0 remiss 0) = 15

Incorrectly classified
Event (test 1 remiss 0) = 3
Non-event (test 0 remiss 1) = 4

Occasional Contributor
Posts: 8

Re: Classification table proc logistic

OK, I have found how SAS calculates a classification table:

SAS/STAT(R) 9.2 User's Guide, Second Edition

Now I need to know how to calculate confidence intervals for sensitivity and specificity, LR+ and LR-, using proc logistic.

Valued Guide
Posts: 679

Re: Classification table proc logistic

I don't have the documentation with me, but I think the ctable option is doing a cross-validation. Each prediction is based on omitting that observation, fitting the model, and predicting the deleted value. This won't be exactly the same as the straight predictions that are in the output table.
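One way to check this is to classify on the cross-validated probabilities instead of the ordinary ones. The sketch below assumes the pred data set from the earlier example, which already requests predprob=(individual crossvalidate), and assumes that the cross-validated probability of remiss=1 is stored in a variable named XP_1 (worth confirming with proc contents). The names ctable_xv, test_fit and test_xv are made up for the illustration.

```sas
/* Assumes PRED was created by the OUTPUT statement above with
   predprob=(individual crossvalidate). XP_1 is assumed to be the
   cross-validated (leave-one-out) probability of remiss=1. */
data ctable_xv;
   set pred;
   test_fit = (phat >= 0.5);   /* classification from the full-data fit       */
   test_xv  = (xp_1 >= 0.5);   /* classification from the leave-one-out value */
run;

proc freq data=ctable_xv;
   tables remiss*test_fit remiss*test_xv / norow nocol nopercent;
run;
```

If the cross-validation explanation is right, the remiss*test_xv table should be the one that agrees with the ctable counts, while remiss*test_fit reproduces the plain proc freq result.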

Occasional Contributor
Posts: 8

Re: Classification table proc logistic

Yes, thank you, I completely agree with you.

I cannot understand why SAS proposes calculating confidence limits with proc freq if it's clear that the results will be different.

And I cannot see how to obtain my CLs for specificity and sensitivity now. Should I (could I) use an online calculator?

I don't know.

Grand Advisor
Posts: 16,908

Re: Classification table proc logistic

You can take the output from the ctable and put that into proc freq to obtain confidence intervals. See the link at the end of this post.

You can get the classification table out with the following ODS OUTPUT statement before your proc logistic, though it will need reformatting to match the layout that proc freq requires.

ods output Classification=classOut;

proc logistic data=Remission outest=betas covout;
   model remiss(event='1')=cell smear infil li blast temp
                /ctable pprob=0.5 ;
   output out=pred p=phat;
run;

24170 - Estimating sensitivity, specificity, positive and negative predictive values, and other stat...
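A rough sketch of that reformatting step, under stated assumptions: the count columns in the ODS data set are assumed to be named CorrectEvents, IncorrectEvents, CorrectNonevents and IncorrectNonevents (check with proc contents), a single pprob= cutpoint is used, and ct2x2, test and count are names invented for this example. The binomial intervals treat the ctable counts as ordinary binomial counts, which is itself an assumption.

```sas
ods output Classification=classOut;   /* capture the ctable output object */
proc logistic data=Remission;
   model remiss(event='1') = cell smear infil li blast temp / ctable pprob=0.5;
run;

/* Reshape the single ctable row into four weighted cells of a 2x2 table.
   The four count column names are assumptions; verify with PROC CONTENTS. */
data ct2x2;
   set classOut;
   remiss=1; test=1; count=CorrectEvents;      output;   /* true positives  */
   remiss=1; test=0; count=IncorrectNonevents; output;   /* false negatives */
   remiss=0; test=1; count=IncorrectEvents;    output;   /* false positives */
   remiss=0; test=0; count=CorrectNonevents;   output;   /* true negatives  */
   keep remiss test count;
run;

title 'Sensitivity';
proc freq data=ct2x2;
   where remiss=1;
   weight count;
   tables test / binomial(level="1");   /* share of events with test=1 */
run;

title 'Specificity';
proc freq data=ct2x2;
   where remiss=0;
   weight count;
   tables test / binomial(level="0");   /* share of nonevents with test=0 */
run;
title;
```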

Occasional Contributor
Posts: 8

Re: Classification table proc logistic

That's debatable. This gives a different sensitivity and specificity. In my case, the differences are over 3% each.

Solution
‎08-12-2014 11:41 PM
Grand Advisor
Posts: 16,908

Re: Classification table proc logistic

Well,

Re #1, SAS's argument is that classifying an observation with a model that was fitted to data including that observation gives biased results. To obtain less biased results, ctable uses a different method.

You say SAS says to use proc freq; can you reference that somewhere?

From what I understand, the suggestion is to use proc freq on the ctable output to obtain estimates of the CI.

I assume the sensitivity and specificity from the two methods will differ more for smaller samples.

Re #2 See post above.

Frequent Contributor
Posts: 109

Re: Classification table proc logistic

This may have been tacitly alluded to here, but I was thinking that SAS used a leave-one-out (LOO) method for calculating the SEN and SPEC in the ctable option.

Valued Guide
Posts: 679

Re: Classification table proc logistic

It does, and this is what is meant by my earlier response about cross validation. Leave one out.

☑ This topic is SOLVED.
