Classification table proc logistic

08-11-2014 10:25 AM

We could create a classification table in two ways:

1. Using proc logistic with ctable pprob=xxx

Example:

proc logistic desc data=mmse;

model fn = lhippoc lmidtemp eicv c_age_a c_age_b ss / ctable pprob=0.32;

run;

2. Using output and manipulating with data:

proc logistic desc data=mmse;

model fn = lhippoc lmidtemp eicv c_age_a c_age_b ss / ctable pprob=0.32;

output out=ci_with_outl p=rsk;

run;

data ci_with_outl;

set ci_with_outl;

if rsk >= 0.32 then pos=1; else pos=0;

run;

proc sort data=ci_with_outl;

by descending pos descending fn;

run;

proc freq data=ci_with_outl order=data;

table pos*fn;

run;

The problem is the following: I get two different classification tables, with different numbers of true/false positives and negatives.

The question is: what algorithm does the ctable option of proc logistic use to calculate the true/false positives and negatives?

All Replies

08-11-2014 10:44 AM

Pretty sure it's the same algorithm; check the documentation under classification table and details.

How close are they? One possibility is that they differ due to rounding. The second is that the probability is the opposite of what you expect, i.e. if you modeled a binary response such as 0/1, the event is considered to be 0, not 1, unless specified otherwise.

08-11-2014 11:01 AM

I checked the pos variable in Excel. It was correct: pos=1 wherever rsk was 0.32 or higher.

08-11-2014 11:05 AM

They are pretty close. Of 147 subjects (47 with disease), 26 with disease and 92 without disease were correctly classified by proc logistic (true positives/true negatives), while 27 and 95, respectively, were correctly classified with proc freq. This gives me a difference of 3% for sensitivity and for specificity.

08-11-2014 02:54 PM

According to the doc >= (GE) is the correct comparison.

If the predicted event probability **exceeds or equals** some cutpoint value, the observation is predicted to be an event observation; otherwise, it is predicted as a nonevent. A frequency table can be obtained by cross-classifying the observed and predicted responses. The CTABLE option produces this table, and the PPROB= option selects one or more cutpoints. Each cutpoint generates a classification table. If the PEVENT= option is also specified, a classification table is produced for each combination of PEVENT= and PPROB= values.

I can't see that you're missing anything. In fact I can reproduce this with the sample data. I would expect this to work and it doesn't, but that doesn't mean I'm not missing something or doing something wrong.

Consider opening a track with tech support. An example that replicates the issue is below:

data Remission;

input remiss cell smear infil li blast temp;

label remiss='Complete Remission';

datalines;

1 .8 .83 .66 1.9 1.1 .996

1 .9 .36 .32 1.4 .74 .992

0 .8 .88 .7 .8 .176 .982

0 1 .87 .87 .7 1.053 .986

1 .9 .75 .68 1.3 .519 .98

0 1 .65 .65 .6 .519 .982

1 .95 .97 .92 1 1.23 .992

0 .95 .87 .83 1.9 1.354 1.02

0 1 .45 .45 .8 .322 .999

0 .95 .36 .34 .5 0 1.038

0 .85 .39 .33 .7 .279 .988

0 .7 .76 .53 1.2 .146 .982

0 .8 .46 .37 .4 .38 1.006

0 .2 .39 .08 .8 .114 .99

0 1 .9 .9 1.1 1.037 .99

1 1 .84 .84 1.9 2.064 1.02

0 .65 .42 .27 .5 .114 1.014

0 1 .75 .75 1 1.322 1.004

0 .5 .44 .22 .6 .114 .99

1 1 .63 .63 1.1 1.072 .986

0 1 .33 .33 .4 .176 1.01

0 .9 .93 .84 .6 1.591 1.02

1 1 .58 .58 1 .531 1.002

0 .95 .32 .3 1.6 .886 .988

1 1 .6 .6 1.7 .964 .99

1 1 .69 .69 .9 .398 .986

0 1 .73 .73 .7 .398 .986

;

proc logistic data=Remission outest=betas covout;

model remiss(event='1')=cell smear infil li blast temp

/ctable pprob=0.5 ;

output out=pred p=phat lower=lcl upper=ucl

predprob=(individual crossvalidate);

run;

data ctable;

set pred;

if phat>=0.5 then test=1;

else test=0;

run;

proc freq data =ctable;

table remiss*test;

run;

08-11-2014 09:10 PM

It is the same thing here: the two methods give different results.

With proc logistic:

Correctly classified

Event (test 1 remiss 1) = 4

Non-event (test 0 remiss 0) = 15

Incorrectly classified

Event (test 1 remiss 0) = 3

Non-event (test 0 remiss 1) = 5

With proc freq:

Correctly classified

Event (test 1 remiss 1) = **5**

Non-event (test 0 remiss 0) = 15

Incorrectly classified

Event (test 1 remiss 0) = 3

Non-event (test 0 remiss 1) = **4**
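To put these counts on the scale discussed in this thread, sensitivity and specificity can be computed directly from the two tables. This is only a minimal sketch: the data set name counts, the method labels and the variable names are made up for illustration, and the numbers are simply the counts listed above.

```sas
/* Counts copied from the two tables above; all names are illustrative. */
data counts;
  input method :$8. tp tn fp fn;
  sensitivity = tp / (tp + fn);   /* true positives / all observed events    */
  specificity = tn / (tn + fp);   /* true negatives / all observed nonevents */
  datalines;
ctable   4 15 3 5
procfreq 5 15 3 4
;
run;

proc print data=counts noobs;
  format sensitivity specificity 5.3;
run;
```

With these counts, sensitivity is 4/9 (about 0.44) for the ctable results and 5/9 (about 0.56) for the proc freq approach, while specificity is 15/18 (about 0.83) for both.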

08-11-2014 10:10 PM

OK, I have found how SAS calculates a classification table:

SAS/STAT(R) 9.2 User's Guide, Second Edition

Now I need to know how to calculate confidence intervals for sensitivity and specificity, LR+ and LR-, using proc logistic.

08-12-2014 03:50 PM

I don't have the documentation with me, but I think the ctable option is doing a cross-validation. Each prediction is based on omitting that observation, fitting the model, and predicting the deleted value. This won't be exactly the same as the straight predictions that are in the output table.
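If that is what ctable does, the cross-validated probabilities from the output statement should give matching counts when cut at the same point. The following is a minimal sketch, assuming the Remission data set from the earlier post has already been created. XP_1 is the name PROC LOGISTIC gives to the cross-validated probability of response level 1 when predprobs=(crossvalidate) is requested; the data set names pred_cv and ctable_cv and the variables test_fit and test_cv are made up here. I have not checked that the counts agree with ctable exactly.

```sas
/* Assumes the Remission data set from the earlier post already exists. */
proc logistic data=Remission;
  model remiss(event='1') = cell smear infil li blast temp
        / ctable pprob=0.5;
  /* p= writes the ordinary predicted probability (phat);              */
  /* predprobs=(crossvalidate) adds the cross-validated XP_0 and XP_1. */
  output out=pred_cv p=phat predprobs=(crossvalidate);
run;

data ctable_cv;
  set pred_cv;
  test_fit = (phat >= 0.5);   /* cut the ordinary prediction        */
  test_cv  = (XP_1 >= 0.5);   /* cut the cross-validated prediction */
run;

proc freq data=ctable_cv;
  tables remiss*test_fit remiss*test_cv / norow nocol nopercent;
run;
```

Comparing the two cross-tabulations with the ctable output shows which set of predictions the classification table is built from.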

08-12-2014 08:40 PM

Yes, thank you, I completely agree with you.

I cannot understand why SAS proposes calculating confidence limits with proc freq if it's clear that the results will be different.

And I cannot understand how I can now obtain my CLs for specificity and sensitivity. Should I, or could I, use an online calculator?

I don't know.

08-12-2014 10:46 PM

You can take the output from the ctable and put that into proc freq to obtain confidence intervals. See the link at the end of this post.

You can write the classification table to a data set with the following ODS statement before your proc logistic, though it will need reformatting into the layout that proc freq requires.

ods output Classification=classOut;

proc logistic data=Remission outest=betas covout;

model remiss(event='1')=cell smear infil li blast temp

/ctable pprob=0.5 ;

output out=pred p=phat;

run;
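One way to do that reformatting is to enter the four cells of the classification table as cell counts and let proc freq compute binomial confidence limits. This is a sketch only: the counts are the proc logistic results reported earlier in this thread for pprob=0.5, typed in by hand rather than read from classOut, and the data set name ctab_counts is hypothetical.

```sas
/* Cell counts typed in from the ctable results reported in this thread. */
data ctab_counts;
  input remiss test count;
  datalines;
1 1 4
1 0 5
0 1 3
0 0 15
;
run;

/* Sensitivity: proportion of test=1 among observed events (remiss=1) */
title 'Sensitivity';
proc freq data=ctab_counts;
  where remiss = 1;
  weight count;
  tables test / binomial(level='1');
run;

/* Specificity: proportion of test=0 among observed nonevents (remiss=0) */
title 'Specificity';
proc freq data=ctab_counts;
  where remiss = 0;
  weight count;
  tables test / binomial(level='0');
run;
title;
```

The binomial option reports the proportion for the requested level together with its confidence limits, so each proc freq step gives one of the two measures.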

08-12-2014 10:53 PM

It's a point of discussion. This gives a different sensitivity and specificity. In my case, the differences are over 3% each.

Solution

08-12-2014 11:41 PM

Well,

Re #1: SAS's argument is that classifying an observation with a model fitted to data that includes that observation is biased. To obtain less biased results, ctable uses a different method.

You say SAS says to use proc freq. Can you reference that somewhere?

From what I understand, the suggestion is to use proc freq on the ctable output to obtain estimates of the CI.

I assume that for smaller samples the sensitivity and specificity will vary more.

Re #2: see the post above.

08-14-2014 12:59 PM

SAS suggestions for sens-spec CI:

08-18-2014 02:07 PM

This may have been tacitly alluded to here, but I was thinking that SAS used a leave-one-out (LOO) method for calculating the SEN and SPEC in the ctable option.

08-18-2014 06:13 PM

It does, and this is what is meant by my earlier response about cross validation. Leave one out.