Advantage of choice1: you use all the data. Disadvantage of choice1: you use all the data. (Yes, that's what I said.)

Suppose your data were such that 10% of the observations were from the low group, 20% from the medium group, and 70% from the high group. If you fit a simple model with var20 (as you name it) only as a main effect, all the other parameters will be estimated predominantly from the values seen in the high group.

You might follow your first instinct and fit separate models for each group, but then you run into the problem of comparing across models. You have no direct way of testing whether the parameter for var1 is the same in each of the groups. You could examine the confidence limits from each model, but that would not be as satisfying as constructing confidence limits on the difference under choice1.

You might wish to consider a more complex model that includes interactions between the categorical variable and the continuous variables, which gives parameter estimates that can be compared directly. However, you need to make sure you have enough data to adequately fit the additional parameters: you would be going from estimating 21 (20 vars plus an intercept) to as many as 58 (19 vars at three levels each plus an intercept, depending on the parameterization you use). To get good estimates, you really need about three times as much data.

This is an opinion, and everyone has them, so take what you like and leave the rest.

Steve Denham
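For anyone who wants to check the parameter counts above, here is a minimal Python sketch. It assumes 19 continuous variables plus one three-level categorical variable, and it assumes reference-cell (dummy) coding, which is only one of several possible parameterizations. The function name `count_parameters` and its arguments are hypothetical, made up for this illustration.

```python
def count_parameters(n_continuous, n_levels, interactions, group_main_effect=True):
    """Count regression parameters, assuming reference-cell (dummy) coding."""
    intercept = 1
    # A categorical variable with k levels needs k - 1 dummy parameters.
    group_terms = (n_levels - 1) if group_main_effect else 0
    # One slope per continuous variable (the slope in the reference group).
    slopes = n_continuous
    # Each continuous variable gets k - 1 extra slope adjustments.
    interaction_terms = n_continuous * (n_levels - 1) if interactions else 0
    return intercept + group_terms + slopes + interaction_terms


# Counted as in the post: one intercept plus 19 slopes at each of 3 levels.
separate_slopes = count_parameters(19, 3, interactions=True, group_main_effect=False)

# Same model, but also letting the intercept differ by group.
with_group_intercepts = count_parameters(19, 3, interactions=True)

# Main-effects-only model under the same coding.
main_effects_only = count_parameters(19, 3, interactions=False)

print(separate_slopes, with_group_intercepts, main_effects_only)
```

Under these assumptions the interaction model has 58 parameters with a common intercept and 60 if the intercept also varies by group. The main-effects model comes to 22 here rather than 21 because dummy coding spends two parameters on the three-level variable. Either way, the interaction model asks for nearly three times as many estimates from the same data.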