Programming the statistical procedures from SAS

Logistic Regression CTABLE options, Please Help. Many Thanks

Frequent Contributor
Posts: 96

Hi All,

I have used the CTABLE option in Logistic Regression to specify cutoff values. It produced the table below, but I am struggling to interpret it.

Your help will be much appreciated. For example, if we want to target customers who are more likely to respond, do I select anyone with a probability > 0.2? With a cutoff of > 0.5, I don't get many people.

Many Thanks

Classification Table

Prob   ----- Correct -----  ----- Incorrect ----  ---------------------- Percentages ----------------------
Level     Event  Non-Event      Event  Non-Event    Correct  Sensitivity  Specificity  False POS  False NEG
0.05     43,841    426,000    356,000     12,420       56.1         77.9         54.5         89        2.8
0.1      24,715    658,000    125,000     31,546       81.4         43.9         84.1       83.5        4.6
0.15     16,000    722,000     59,899     40,261       88.1         28.4         92.3       78.9        5.3
0.2       9,516    753,000     29,047     46,745         91         16.9         96.3       75.3        5.8
0.25      6,078    768,000     13,932     50,183       92.4         10.8         98.2       69.6        6.1
0.3       2,867    777,000      5,708     53,394         93          5.1         99.3       66.6        6.4
0.35      1,660    779,000      2,846     54,601       93.1            3         99.6       63.2        6.5
0.4         833    781,000      1,050     55,428       93.3          1.5         99.9       55.8        6.6
0.45        377    782,000        372     55,884       93.3          0.7          100       49.7        6.7
0.5         205    782,000        150     56,056       93.3          0.4          100       42.3        6.7
0.55        127    782,000         80     56,134       93.3          0.2          100       38.6        6.7
0.6          93    782,000         51     56,168       93.3          0.2          100       35.4        6.7
0.65         71    782,000         42     56,190       93.3          0.1          100       37.2        6.7
0.7          58    782,000         28     56,203       93.3          0.1          100       32.6        6.7
0.75         47    782,000         23     56,214       93.3          0.1          100       32.9        6.7
0.8          36    782,000         15     56,225       93.3          0.1          100       29.4        6.7
0.85         23    782,000          9     56,238       93.3            0          100       28.1        6.7
0.9          13    782,000          3     56,248       93.3            0          100       18.8        6.7
0.95          5    782,000          1     56,256       93.3            0          100       16.7        6.7
1             0    782,000          0     56,261       93.3            0          100          .        6.7

Super Contributor
Posts: 543

Re: Logistic Regression CTABLE options, Please Help. Many Thanks

Hi.

See pages 8-10 of this document:

http://www.ats.ucla.edu/stat/sas/library/ts274.pdf

maybe it will clear up your question.

I think a question such as "do I select anyone > 0.2" is not something anyone else can answer for you.

Anca.

Regular Contributor
Posts: 152

Re: Logistic Regression CTABLE options, Please Help. Many Thanks

At and above a specific cutoff value, sensitivity is the percentage of those with the outcome of interest that your logistic model detects: the percentage of customers who actually respond that the model predicts as likely to respond. Specificity is the percentage of those without the outcome of interest that the model correctly identifies: the percentage of customers who do not respond that the model predicts as not likely to respond. The percentage of false positives is the percentage of customers that your model predicts as likely to respond who in fact do not respond. The percentage of false negatives is the percentage of customers that your model predicts as not likely to respond who do in fact respond.

In your example, at a cutoff of 0.20 or more, your model picks up only 16.9% [=sensitivity] of the customers who respond, and it also flags 3.7% [=100% - 96.3% (specificity)] of the customers who do not respond. However, 75.3% [=% of false positives] of those your model predicts as likely to respond will in fact not respond, though 94.2% [=100% - 5.8% (% of false negatives)] of those your model predicts as not likely to respond will in fact not respond.

You can visualize this better in a two-by-two table like the following:

Model prediction     Respond   Do not respond     Total
0.20 or more           9,516           29,047    38,563
< 0.20                46,745          753,000   799,745
Total                 56,261          782,047   838,308

Sensitivity     =   9,516 /  56,261 = 16.9%
Specificity     = 753,000 / 782,047 = 96.3%
False positives =  29,047 /  38,563 = 75.3%
False negatives =  46,745 / 799,745 =  5.8%

To pick an appropriate cutoff, you have to balance the costs against the benefits. Using the % of false positives as one criterion, of every four customers you tried to contact, on average only one would be likely to respond.

If contacting customers is relatively cheap, you might not worry so much about the false positive % and prefer to increase the percentage of likely responders that the model detects (that is, to increase the sensitivity). At a cutoff of 0.05 or above, only one of nine customers you tried to contact would be likely to respond [=100% - 89% false positives], but you would in fact detect more than three-quarters of the customers who would respond [sensitivity = 77.9%].
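
For anyone who wants to check the arithmetic behind the two-by-two table, here is a minimal Python sketch (not SAS output) that recomputes the four percentages from the counts at the 0.20 cutoff. The function name `classification_metrics` and its argument names are hypothetical, chosen only for illustration. Note that "false positives" and "false negatives" here follow the definitions used in this reply: the share of predicted responders who do not respond, and the share of predicted non-responders who do respond.

```python
def classification_metrics(tp, fp, fn, tn):
    """Percentages in the style of the classification table, from 2x2 counts.

    tp: predicted to respond, do respond
    fp: predicted to respond, do not respond
    fn: predicted not to respond, do respond
    tn: predicted not to respond, do not respond
    """
    return {
        "sensitivity": 100 * tp / (tp + fn),      # responders detected
        "specificity": 100 * tn / (tn + fp),      # non-responders correctly excluded
        "false_positives": 100 * fp / (tp + fp),  # share of predicted responders who do not respond
        "false_negatives": 100 * fn / (fn + tn),  # share of predicted non-responders who do respond
    }


# Counts at the 0.20 cutoff, copied from the two-by-two table above.
metrics = classification_metrics(tp=9516, fp=29047, fn=46745, tn=753000)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

The same function can be applied to any other row of the classification table by plugging in that row's four counts.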

Frequent Contributor
Posts: 96

Re: Logistic Regression CTABLE options, Please Help. Many Thanks

Many thanks, that's really helpful. In my case, contacting customers is relatively cheap as it is by email. So does that mean I could pick anyone with a probability of 0.1 or more, to increase the sensitivity?

Cheers

Regular Contributor
Posts: 152

Re: Logistic Regression CTABLE options, Please Help. Many Thanks

Yes.  At a cutoff of 0.10 or more, the sensitivity would rise to 43.9% from 16.9% at a cutoff of 0.20 or more, so that you would reach slightly less than one-half of the customers who would be likely to respond. Since the % of false positives at a cutoff of 0.10 or more would equal 83.5%, only one of six customers you tried to contact would be likely to respond.
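
The trade-off between cutoffs can also be laid out side by side. The following is a rough Python sketch, assuming the counts from the classification table in the first post; the names `TABLE`, `TOTAL_RESPONDERS`, and `summarize` are hypothetical and exist only for this illustration. For each cutoff it reports how many customers would be contacted, how many responders that reaches, and roughly how many contacts are needed per responder.

```python
# cutoff -> (responders correctly predicted, non-responders wrongly predicted to respond)
# Counts copied from the classification table in the first post.
TABLE = {
    0.05: (43841, 356000),
    0.10: (24715, 125000),
    0.15: (16000, 59899),
    0.20: (9516, 29047),
}
TOTAL_RESPONDERS = 56261  # correct events + incorrect non-events in any row


def summarize(cutoff):
    """Return contact volume and yield for one cutoff in TABLE."""
    tp, fp = TABLE[cutoff]
    contacted = tp + fp  # everyone at or above the cutoff
    return {
        "contacted": contacted,
        "responders_reached": tp,
        "sensitivity_pct": 100 * tp / TOTAL_RESPONDERS,
        "contacts_per_responder": contacted / tp,
    }


for cutoff in sorted(TABLE):
    s = summarize(cutoff)
    print(
        f"cutoff {cutoff:.2f}: contact {s['contacted']:,}, "
        f"reach {s['responders_reached']:,} responders "
        f"({s['sensitivity_pct']:.1f}%), "
        f"about 1 responder per {s['contacts_per_responder']:.1f} contacts"
    )
```

If email really is cheap, the larger contact list at 0.10 may be acceptable; if each contact had a meaningful cost, the contacts-per-responder figure is the number to weigh against the value of a response.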
