Calcite | Level 5

Hi all,

I use PROC LOGISTIC to find the most predictive combination of parameters (the model) for a diagnosis. The code is below:


%let list_param = AP01 AP02 AP03 AP04 AP05 AP06 AP07 AP08 AP09 AP10 AP11 AP12 AP13 AP14 AP15 AP16 AP17 AP18 AP19 AP20
AP21 AP22 AP23 AP24 AP25 AP26 AP27 AP28 AI01 AI02 AI03 AI04 AI05 AI06 AI07 AI08 AI09 AI10 AI11 AI12
AI13 AI14 AI15 AI16 AI17 AI18 AI19 AI20 AI21;

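proc logistic data=Source plots(only)=roc;
  class reasvis;
  /* stepwise selection; classification table at a 0.5 probability cutoff */
  model acrstat = &list_param. age sexn reasvis / CTABLE pprob=0.5 SELECTION=stepwise;
  /* save the individual predicted probabilities */
  output out=probs6 PREDPROBS=i;
run;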


acrstat is the dependent variable; its values indicate whether the diagnosis is present or not.

Predictors: the list of parameters (&list_param), age and sexn are all numeric variables; reasvis is a character variable.


The Source dataset contains only 9 patients with the diagnosis and 320 without it (329 subjects in total).


I have a problem with it: the procedure predicts that no diagnosis will occur at all. I think this is because of the small number of subjects with the diagnosis.

Is it possible to find a model that will predict the occurrence of the diagnosis?


Super User

The large number of variables is perhaps a factor as well. When you look at the 9 subjects with the diagnosis and all of the variables in the model, do any 2 of them share the same factors? You don't indicate what your variables represent, but if they are all of a "yes/no" or "present/not present" type, this should be easy to check. If none of the 9 have any factors in common, then I wouldn't be surprised by the result.


You could try SELECTION=SCORE to see whether any of the subsets do very well.
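A minimal sketch of that suggestion follows; it is not tested against the Source data. It assumes acrstat is coded 1 for a diagnosis (the EVENT='1' value is an assumption), and it leaves reasvis out because, to my knowledge, SELECTION=SCORE does not accept CLASS effects. The BEST=, START= and STOP= values are illustrative only.

```sas
/* Sketch: best-subset (score) selection over the numeric predictors. */
/* Assumption: acrstat = 1 marks a diagnosis; adjust EVENT= otherwise. */
/* reasvis is omitted because it is a CLASS (character) variable.      */
proc logistic data=Source;
  model acrstat(event='1') = &list_param. age sexn
        / selection=score best=3 start=1 stop=3;
run;
```

This lists the 3 highest-scoring models of each size from 1 to 3 predictors. With only 9 events, small subsets are probably the only ones worth considering.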

Calcite | Level 5




You don't indicate what your variables represent


The variables are numeric with 1 for 'Yes' and 0 for 'No'.

Also, SELECTION=stepwise already removes the parameters that are not statistically significant.

Super User

This is a logistic regression model with a small event rate. There are some known ways to deal with this, including assigning prior probabilities via a Bayesian methodology.


You can find a better model by removing some variables that don't add value to the regression.
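As a rough sketch of the Bayesian route mentioned above, one option is the BAYES statement in PROC GENMOD. Everything specific here is an assumption made for illustration: that acrstat is coded 0/1 with 1 meaning diagnosis, that AP01 and AP02 stand in for whichever few predictors are kept, and that a normal prior with variance 4 is reasonable for the coefficients.

```sas
/* Sketch: Bayesian logistic regression with a normal prior on the coefficients. */
/* Assumptions: acrstat is 0/1 with 1 = diagnosis (DESCENDING models the 1s);    */
/* AP01 and AP02 are placeholders for the few predictors that are retained;      */
/* the prior variance, seed, and chain lengths are illustrative only.            */
proc genmod data=Source descending;
  model acrstat = AP01 AP02 age sexn / dist=binomial link=logit;
  bayes seed=20231 nbi=2000 nmc=10000 coeffprior=normal(var=4);
run;
```

An informative prior shrinks the coefficient estimates toward zero, which can stabilize a fit built on only 9 events. The result depends on the prior chosen, so it is worth trying more than one variance.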

Calcite | Level 5



The SELECTION=stepwise option already removes the parameters that are not statistically significant. But despite this, I get a classification table with no predicted diagnoses.
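One thing that may be worth checking, offered as a sketch rather than a confirmed fix: with 9 events among 329 subjects the overall event rate is about 2.7%, so few or no predicted probabilities may reach the PPROB=0.5 cutoff used in the original code. The example below requests the classification table over a range of lower cutoffs. It assumes acrstat is coded 1 for a diagnosis (the EVENT='1' value is an assumption), and the cutoff range is illustrative.

```sas
/* Sketch: classification table at several cutoffs instead of 0.5 only. */
/* Assumption: acrstat = 1 marks a diagnosis; adjust EVENT= otherwise.  */
proc logistic data=Source;
  class reasvis;
  model acrstat(event='1') = &list_param. age sexn reasvis
        / ctable pprob=(0.01 to 0.20 by 0.01) selection=stepwise;
run;
```

The table then shows sensitivity and specificity at each cutoff, which makes it easier to see whether the model separates the 9 cases at all.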


