Hello everyone,

I am in the process of selecting a subset of variables for the logistic regression model I am building. I have 14 variables: 4 of them are binary categorical variables and one is a 3-level categorical variable. I have previously used stepwise selection to choose models.

I want to try the "score" method and apply each combination of variables it returns to the test data in order to calculate the out-of-sample F-score (OOS F-score) for each combination. I want to see the lift in the OOS F-score from adding additional variables.
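
To make the comparison concrete, here is a minimal Python sketch (not SAS) of the F-score calculation I have in mind. The labels and predictions are made-up values for illustration only, and `f_score` is a hypothetical helper, not a function from any library.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels coded 0/1, where 1 is the event."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical test-set labels and predictions from two candidate models.
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
pred_small = [1, 0, 0, 1, 0, 1, 0, 0]  # model with fewer variables
pred_large = [1, 0, 1, 1, 0, 1, 0, 0]  # same model plus one more variable

# Lift in OOS F-score from adding the extra variable.
lift = f_score(y_test, pred_large) - f_score(y_test, pred_small)
print(round(lift, 4))
```

The idea is to repeat this for every subset the "score" method returns and compare the F-scores across subset sizes.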

Unfortunately, I found out that the "score" method doesn't handle class variables. Can I create indicator variables for each level of the categorical variable and then run the "score" method?

I have trouble understanding the output of this approach. Do we consider the categorical variable to be part of the subset if one of its levels is chosen?

For example, X1, X2, X3, and X4 are the variables being considered. X4 is a categorical variable with 3 levels, so I make 3 indicator variables (X4_I1, X4_I2, and X4_I3).
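
As an illustration of the indicator coding described above, here is a small Python sketch (not SAS). The level names "a", "b", "c" and the helper `make_indicators` are assumptions for the example; only the column names X4_I1, X4_I2, and X4_I3 come from my setup.

```python
def make_indicators(values, name):
    """Build one 0/1 indicator column per distinct level of a categorical variable."""
    levels = sorted(set(values))  # fixed ordering so column numbering is reproducible
    return {
        f"{name}_I{i}": [1 if v == level else 0 for v in values]
        for i, level in enumerate(levels, start=1)
    }


# Hypothetical values of the 3-level variable X4 for five observations.
x4 = ["a", "b", "c", "a", "c"]
indicators = make_indicators(x4, "X4")

for column, flags in indicators.items():
    print(column, flags)
```

Each observation has exactly one of the three indicators set to 1, so the three columns together carry the same information as the original X4.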

If the variables included in the model by the "score" method are X1, X2, and X4_I1, do we consider X4 to be a significant variable since one of its levels is chosen?

Lastly, since I have 14 variables, shouldn't the number of variable combinations be 14C1 + 14C2 + ... + 14C13 + 14C14? I only get 309 models after running the "score" method. Is there a reason why only a subset of all possible combinations is shown in the output?
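
For reference, this is the count I expected, worked out in a short Python sketch (not SAS). It assumes 14 candidate variables and counts every non-empty subset; the variable names `n_vars` and `total` are just for the example.

```python
from math import comb

n_vars = 14  # number of candidate variables

# Sum of "14 choose k" over every subset size k = 1..14.
total = sum(comb(n_vars, k) for k in range(1, n_vars + 1))

# The same count in closed form: all subsets minus the empty one.
assert total == 2 ** n_vars - 1

print(total)  # 16383
```

So the full set of non-empty subsets would be 16383 models, far more than the 309 I see in the output.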

I haven't specified the BEST, START, or STOP options.

Thank you in advance for your help,


