avtar
Calcite | Level 5

Hi all,

 

I am running a stepwise logistic regression to select the independent variables that predict a response variable. At the moment I am running the code below on a dataset, which in turn produces a model.

 

 

proc logistic data=inputdata  outest=ouptutdata covout; /* OUTEST= saves the parameter estimates; COVOUT adds their covariances for the final selected model */
   model default_12mnd (event='1')=&faktorlist.
                / selection=stepwise
                  slentry=0.3     /* A significance level of 0.3 is required to allow a variable into the model */
                  slstay=0.35     /* A significance level of 0.35 is required for a variable to stay in the model */
                  details
                  lackfit;      /* A Hosmer and Lemeshow goodness-of-fit test for the final selected model */
   output out=pred p=phat lower=lcl upper=ucl /* The output contains the cumulative predicted probabilities and the corresponding confidence limits, and the individual and cross validated predicted probabilities for each observation */
          predprob=(individual crossvalidate);
run;

 

 

My question is: assuming the variables can be grouped, is there a way to select/force, say, a maximum of 2 variables from grouping 1, a maximum of 3 variables from grouping 2, etc.? Ultimately I would like to create a more 'balanced' model, as at the moment most variables that end up in the model tend to be from one particular grouping. I understand that this will result in a less accurate model, but ultimately I would like it to be more practical.

 

Thank you in advance  

4 REPLIES
Ksharp
Super User

A LOGISTIC model is a GLM, so you will not be able to achieve what you intend within a single model.

Why not fit a separate logistic model for each group and then compare those models?

PaigeMiller
Diamond | Level 26

There is no option in the stepwise methods of PROC LOGISTIC to do this type of grouping. It would have to be done manually somehow, by you adding/removing certain variables from the model, and running the model again.

--
Paige Miller
avtar
Calcite | Level 5

Thanks for the responses. 

 

Perhaps stepwise selection isn't the correct approach for me to use? Suggestions welcome.

StatDave
SAS Super FREQ

Let's say you have a bunch of variables with names beginning with A, a bunch beginning with B, and so on. I assume that what you want is to do selection within the A set, separately within the B set, and so on. If so, that cannot be done in one shot, but you could do it with a separate PROC LOGISTIC step for each set.

In the first step, for the A set, list all of your variables in the MODEL statement so that the A variables come last. Specify the INCLUDE= option with the number of variables that precede the A variables; this forces all of those preceding variables to stay in the model. Then use the SELECTION= option (and whatever other options you want, such as STOP=) so that selection happens only within the A set.

Once you have selected the A variables to keep, run the next PROC LOGISTIC step with the selected A variables first, followed by all of the remaining variables, with the B set placed at the very end. Update the INCLUDE= value to count all variables except the B set, and use the same selection options to select within the B set. Continue in this way for all sets.

Note that there is no guarantee that the final set of selected variables will be the same if you do this in the order A, B, C, ... versus another order like C, A, B, ... but model selection methods are themselves heuristic and so are not guaranteed to find the optimal model. So, in that sense, this doesn't make things any worse.
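To make the two-step idea concrete, here is a minimal sketch of what the first two steps might look like. The variable names (a1-a5, b1-b4, x1-x3) and the choice of a2 and a4 as the survivors of step 1 are hypothetical placeholders, not taken from the original data. The sketch uses SELECTION=FORWARD rather than STEPWISE because, as far as I know, STOP= has no effect with SELECTION=STEPWISE; it also assumes that STOP= counts the effects forced in by INCLUDE=, which is worth checking against the PROC LOGISTIC documentation for your release.

```sas
/* Hypothetical variables: a1-a5 = group A, b1-b4 = group B, x1-x3 = everything else */

/* Step 1: select within the A set only.                                  */
/* The 7 non-A variables are listed first and forced in with INCLUDE=7.   */
proc logistic data=inputdata;
   model default_12mnd(event='1') = b1-b4 x1-x3 a1-a5
         / selection=forward
           include=7     /* first 7 effects (b1-b4, x1-x3) stay in every model       */
           stop=9        /* assumed: total cap of 9 = 7 forced + at most 2 from A    */
           slentry=0.3;
run;

/* Step 2: suppose step 1 kept a2 and a4 (hypothetical result).           */
/* Put the kept A variables and the others first, and the B set last.     */
proc logistic data=inputdata;
   model default_12mnd(event='1') = a2 a4 x1-x3 b1-b4
         / selection=forward
           include=5     /* a2, a4, x1-x3 are forced in                              */
           stop=8        /* assumed: total cap of 8 = 5 forced + at most 3 from B    */
           slentry=0.3;
run;
```

The same pattern would repeat for each remaining group, each time moving the group under selection to the end of the MODEL statement and updating INCLUDE= and STOP= accordingly.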
