Hello everyone,

This seems like a very basic question, but I have not been able to find a simple, straightforward answer to it anywhere. I would appreciate a clear response.

I am developing a model using PROC LOGISTIC which has one binary response variable and four predictor variables: two continuous and two categorical (really binary). I suspect there may be an interaction between the continuous effect "vehicle speed" and the categorical effect "vehicle size". My hypothesis is that "vehicle speed" may have a different effect on accident outcome depending on the size of the vehicle involved.

To test this, I initially used stepwise selection on a set of potential factors and interactions, including a nested effect of Speed(Type). This nested effect was selected as significant. However, I began to wonder what would happen if I used an interaction instead of nesting, i.e. Speed*Type. My results are intended for use by people without any background in statistics, so the interaction might be easier for me to explain in layman's terms. I ran the selection again with Speed*Type also available as a candidate, and the stepwise analysis preferred Speed*Type over Speed(Type).

Interestingly, the parameter estimates for the two options are slightly different, but the rest of the model is basically identical, with no change in the AUC value or in the results of the Hosmer-Lemeshow (H-L) test.

What I would like to know is this: When is it appropriate to use a nested effect vs. a normal interaction term? Which would be best in this case? What accounts for the different parameter estimates for Speed*Type vs. Speed(Type)?

Thanks for your assistance!