We are running a logistic regression with an interaction term that is the cross product of two dummy variables, each with two levels. The variables in the interaction term therefore form four groups: male with medicine, female with medicine, male w/o medicine, and female w/o medicine. We now want to use one of the 4 groups as the reference group and test the differences against it. I created a categorical variable (gender_med) with 4 levels to represent the 4 groups and used it in place of the interaction term in the logistic regression model. SAS created 3 dummy variables for that variable, and in the results one of them has a collinearity problem with the gender and medicine variables. What should I do?
I am thinking about taking the gender and medicine variables out of the model, since that information can be derived from the newly created gender_med variable. Am I right?
Correct. Basically, you have just 3 degrees of freedom (beyond the intercept) to represent the 4 categories. You can specify the full model either with the two main effects plus their interaction or with your own combined variable, but not with both at once, which is what causes the collinearity. Either way, you get the same overall model fit. The difference is in the ease of computing the post hoc tests.
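To see why the two specifications are interchangeable and why mixing them is collinear, here is a minimal sketch in Python (not SAS) that compares the rank of the design matrices. The 0/1 coding (1 = male, 1 = medicine), the choice of female w/o medicine as the reference group, and the helper names `rank` and `dummies` are all assumptions made for illustration, not something taken from your model. Only the four distinct cells are listed, because every row of a real data set repeats one of them.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a small matrix via exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(m[0])):
        # Find a row at or below position rk with a nonzero entry in column c.
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        # Clear column c in every other row.
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


# The four gender x medicine cells (assumed coding: 1 = male, 1 = medicine).
cells = list(product([0, 1], [0, 1]))


def dummies(g, med):
    """Three dummies for gender_med; reference (assumed) = female w/o medicine."""
    return [int((g, med) == (1, 1)),   # male with medicine
            int((g, med) == (0, 1)),   # female with medicine
            int((g, med) == (1, 0))]   # male w/o medicine


# Intercept + main effects + interaction.
interaction_model = [[1, g, med, g * med] for g, med in cells]
# Intercept + the three gender_med dummies.
combined_model = [[1] + dummies(g, med) for g, med in cells]
# Intercept + main effects + gender_med dummies (the collinear setup).
overspecified = [[1, g, med] + dummies(g, med) for g, med in cells]

rank_interaction = rank(interaction_model)
rank_combined = rank(combined_model)
rank_over = rank(overspecified)

print(rank_interaction, rank_combined)          # 4 4
print(len(overspecified[0]), rank_over)         # 6 4
```

Both 4-column specifications have full rank 4, so they describe the same four cell means. The 6-column version still has rank 4, so the extra columns carry no new information and the software has to flag or drop them. Under the coding assumed here, the gender_med coefficients relate to the interaction model as follows: male w/o medicine equals the gender coefficient, female with medicine equals the medicine coefficient, and male with medicine equals gender + medicine + interaction.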