Classification table proc logistic

12-09-2015 10:45 AM

Hi all,

I use PROC LOGISTIC to find the most predictive combination of parameters (model) for a diagnosis. The code is below:

    %let list_param = AP01 AP02 AP03 AP04 AP05 AP06 AP07 AP08 AP09 AP10 AP11 AP12 AP13 AP14 AP15 AP16 AP17 AP18 AP19 AP20
                      AP21 AP22 AP23 AP24 AP25 AP26 AP27 AP28 AI01 AI02 AI03 AI04 AI05 AI06 AI07 AI08 AI09 AI10 AI11 AI12
                      AI13 AI14 AI15 AI16 AI17 AI18 AI19 AI20 AI21;

    proc logistic data=Source plots(only)=roc;
       class reasvis;
       model acrstat = &list_param. age sexn reasvis / CTABLE pprob=0.5 SELECTION=stepwise;
       output out=probs6 PREDPROBS=i;
    run;

**acrstat** is the dependent variable; its value indicates whether or not the diagnosis is present.

Predictors: the list of parameters (&list_param), age, and sex (sexn) are all numeric variables; **reasvis** is a character variable.

There are only 9 patients with the diagnosis and 320 without it in the Source dataset (329 subjects in total).

My problem is that the procedure predicts that no diagnosis will occur for any subject. I think this is because of the small number of subjects with the diagnosis.

Is it possible to find a model that will predict the occurrence of the diagnosis?

Thanks.

Posted in reply to Ektchup

12-09-2015 11:17 AM

The large number of variables is perhaps a factor as well. When you look at the 9 diagnosed patients and all of the variables in the model, do any 2 of them have the same factors? You don't indicate what your variables represent, but if they are all of a "yes/no" or "present/not present" type, it should be easy to check. If none of the patients have any factors in common, then I wouldn't be surprised by the result.

You could try Selection=Score to see if any of the subsets do very well.
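A minimal sketch of the Selection=Score suggestion, reusing the dataset and macro variable from the original post. The values BEST=3 and STOP=4 are illustrative assumptions, not recommendations. As far as I know, SELECTION=SCORE does not accept CLASS effects, so the character variable reasvis is left out of this screening run.

```sas
/* Sketch only: best-subset screening by score chi-square.        */
/* Assumes Source and &list_param exist as in the original post.  */
/* BEST= and STOP= values below are arbitrary illustrations.      */
proc logistic data=Source;
   /* reasvis omitted: SELECTION=SCORE is assumed not to support  */
   /* CLASS variables                                             */
   model acrstat = &list_param. age sexn
         / selection=score
           best=3    /* list the 3 best subsets of each size      */
           stop=4;   /* consider subsets of at most 4 effects     */
run;
```

With only 9 events, keeping STOP= small limits the search to models that the data could plausibly support.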

Posted in reply to ballardw

12-10-2015 02:52 AM

Thanks, ballardw!

__You don't indicate what your variables represent__

The variables are numeric with 1 for 'Yes' and 0 for 'No'.

Also, parameters that are not statistically significant are removed by **SELECTION=stepwise**.
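Because the parameters are coded 1 for 'Yes' and 0 for 'No', the check suggested earlier in the thread (do the 9 diagnosed patients share any factors?) can be sketched with a sum per outcome group. This assumes every variable in &list_param uses that 0/1 coding; age and sexn are excluded because a sum is not meaningful for them.

```sas
/* Sketch only: for 0/1 indicators, SUM within each acrstat level */
/* is the number of patients with that parameter present.         */
/* Assumes Source and &list_param exist as in the original post.  */
proc means data=Source sum maxdec=0;
   class acrstat;        /* one block of sums per outcome level   */
   var &list_param.;     /* the 0/1 parameters only               */
run;
```

A parameter whose sum is 0 (or equal to the group size) among the diagnosed patients cannot help separate them from the rest on its own.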

Posted in reply to Ektchup

12-09-2015 01:03 PM

This is a logistic regression model with a low event rate. There are some known ways to deal with this, including assigning prior probabilities via a Bayesian methodology.

http://stats.stackexchange.com/questions/10236/applying-logistic-regression-with-low-event-rate

You can find a better model by removing some variables that don't add value to the regression.
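One option that is often suggested for rare events is Firth's penalized likelihood, which PROC LOGISTIC offers through the FIRTH option on the MODEL statement. It is a different technique from assigning prior probabilities, so treat the following as a sketch under assumptions rather than the approach described above. The short predictor list (AP01, AP02) is a hypothetical placeholder for whatever reduced set is chosen, and event='1' assumes acrstat is coded 1 for a diagnosis.

```sas
/* Sketch only: Firth penalized-likelihood logistic regression.   */
/* AP01 and AP02 are placeholders for a reduced predictor set.    */
/* event='1' assumes acrstat = 1 means the diagnosis is present.  */
proc logistic data=Source;
   class reasvis / param=ref;
   model acrstat(event='1') = AP01 AP02 age sexn reasvis
         / firth        /* penalized likelihood for sparse events */
           clparm=pl;   /* profile-likelihood confidence limits   */
run;
```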

Posted in reply to Reeza

12-10-2015 02:41 AM

Hi, Reeza

Thanks.

Parameters that are not statistically significant are removed by the **SELECTION=stepwise** option. But despite this, I get a classification table with no predicted diagnoses.
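One likely contributor is the cutoff: with 9 events in 329 subjects (about 2.7%), predicted probabilities will rarely exceed pprob=0.5, so every subject is classified as a non-event. A minimal sketch that requests the classification table over a range of lower cutoffs follows. The range 0.01 to 0.20 is an illustrative assumption, and event='1' assumes acrstat is coded 1 for a diagnosis.

```sas
/* Sketch only: classification table at several low cutoffs.      */
/* Assumes Source and &list_param exist as in the original post.  */
/* event='1' assumes acrstat = 1 means the diagnosis is present.  */
proc logistic data=Source;
   class reasvis;
   model acrstat(event='1') = &list_param. age sexn reasvis
         / selection=stepwise
           ctable
           pprob=(0.01 to 0.20 by 0.01);  /* illustrative range   */
run;
```

Each row of the resulting table shows sensitivity and specificity at one cutoff, which makes the trade-off visible instead of fixing it at 0.5.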