selecting variables for a stepwise logistic from specific groupings

avtar · Posted 09-20-2019 10:11 AM

Hi all,

I am running a stepwise logistic regression to build a model of independent variables that predict a response variable. I am currently running the code below on a dataset, which in turn produces a model.

proc logistic data=inputdata outest=ouptutdata covout; /* Parameter estimates and their covariances for the final selected model */
   model default_12mnd (event='1')=&faktorlist.
                / selection=stepwise
                  slentry=0.3     /* A significance level of 0.3 is required to allow a variable into the model */
                  slstay=0.35     /* A significance level of 0.35 is required for a variable to stay in the model */
                  details
                  lackfit;      /* A Hosmer and Lemeshow goodness-of-fit test for the final selected model */
   output out=pred p=phat lower=lcl upper=ucl /* The output contains the cumulative predicted probabilities and the corresponding confidence limits, and the individual and cross validated predicted probabilities for each observation */
          predprob=(individual crossvalidate);
run;

My question: assuming the variables can be grouped, is there a way to select or force, say, a maximum of 2 variables from grouping 1, a maximum of 3 variables from grouping 2, and so on? Ultimately I would like to create a more 'balanced' model, because at the moment most variables that end up in the model tend to come from one particular grouping. I understand that this will result in a less accurate model, but I would like the model to be more practical.

Thank you in advance

Ksharp · Posted 09-21-2019 07:53 AM

A LOGISTIC model is a GLM. It is not able to achieve what you intend.

Why not fit a logistic model for each group and compare these models?
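
One way to read this suggestion is to run the same stepwise selection once per variable grouping and compare the resulting models, for example by the AIC and SC values in the Model Fit Statistics table. The sketch below is only an illustration under assumptions: the macro variables group1 and group2, the variable names in them, and the macro fit_by_group are hypothetical and not part of the original code.

```sas
/* Hypothetical macro variables: one list of candidate variables per grouping */
%let group1 = a1 a2 a3 a4;
%let group2 = b1 b2 b3;

%macro fit_by_group(ngroups=2);
   %local g;
   %do g = 1 %to &ngroups;
      title "Stepwise selection within grouping &g";
      proc logistic data=inputdata;
         /* &&group&g resolves to &group1, &group2, ... */
         model default_12mnd(event='1') = &&group&g
               / selection=stepwise
                 slentry=0.3
                 slstay=0.35;
      run;
   %end;
   title;   /* clear the title after the last fit */
%mend fit_by_group;

%fit_by_group(ngroups=2)
```

Each run selects only among the variables of one grouping, so the fitted models show which variables matter within a grouping, not how they behave once variables from other groupings are also in the model.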

PaigeMiller · Posted 09-21-2019 08:12 AM

There is no option in the stepwise methods of PROC LOGISTIC to do this type of grouping. It would have to be done manually somehow, by you adding/removing certain variables from the model, and running the model again.

--
Paige Miller

avtar · Posted 09-22-2019 12:09 PM

Thanks for the responses.

Perhaps stepwise selection isn't the correct approach for me to use? Suggestions welcome.

StatDave · Posted 09-23-2019 11:17 AM

Let's say you have a bunch of variables with names beginning with A, a bunch beginning with B, and so on. I assume that what you want to do is to do selection with the A set and separately within the B set, and so on. If correct, that cannot be done at one shot, but you could do it in a separate PROC LOGISTIC step for each set.

In the first step for the A set, you would list all of your variables in the MODEL statement so that all of the A variables are last. You would specify the INCLUDE= option, specifying the number of variables preceding the set of A variables to force all of them to stay in the model. Then use the SELECTION= option (and whatever other options you want such as STOP=) to do the selection only within the A set.

Once you select the variables to keep in the A set, you can run the next PROC LOGISTIC step with the selected A variables followed by all of the variables except the B set which again will be put at the very end. Update the INCLUDE= value to include all variables except the B set and do the same selection options to select within the B set. And continue for all sets.

Note that there is no guarantee that the final set of selected variables will be the same if you do this in the order A, B, C, ... vs another order like C, A, B, ... but model selection methods themselves are heuristic and so are not guaranteed to find the optimal model. So, in that sense, this doesn't make things any worse.
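
A minimal sketch of the first two steps, assuming hypothetical variable names (a1-a4 for the A set, b1-b3 for the B set, c1-c2 for everything else) and assuming that a2 and a4 happen to be selected in step 1. It uses SELECTION=FORWARD rather than STEPWISE because, as far as I know, STOP= sets a maximum number of effects only for forward selection and has no effect with STEPWISE. It also assumes that STOP= counts the effects forced in by INCLUDE=, which is worth confirming against the PROC LOGISTIC documentation for your release.

```sas
/* Step 1: select within the A set (hypothetical names a1-a4).       */
/* The five non-A variables come first and are forced in by INCLUDE=. */
proc logistic data=inputdata;
   model default_12mnd(event='1') = b1 b2 b3 c1 c2   /* forced in     */
                                    a1 a2 a3 a4      /* candidates    */
         / selection=forward
           include=5      /* first 5 effects stay in every model      */
           slentry=0.3
           stop=7;        /* assumed: 5 forced + at most 2 from A     */
run;

/* Step 2: suppose step 1 kept a2 and a4 (an assumption for this      */
/* illustration). Force them and the C variables in, then select      */
/* within the B set, which is now listed last.                        */
proc logistic data=inputdata;
   model default_12mnd(event='1') = a2 a4 c1 c2      /* forced in     */
                                    b1 b2 b3         /* candidates    */
         / selection=forward
           include=4      /* first 4 effects stay in every model      */
           slentry=0.3
           stop=7;        /* assumed: 4 forced + at most 3 from B     */
run;
```

The same pattern would repeat for each remaining set: move the set being searched to the end of the MODEL statement, set INCLUDE= to the number of effects before it, and set STOP= to that number plus the cap for the set.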
