08-01-2017 01:11 PM
I have a multiple regression model,
Y ~ X1 + X2 + X3 + X4 + X5 + X6 + intercept(?)
I have read that it is better to keep the intercept; otherwise the regression line is forced through the origin and the estimates can be biased, especially with the 6 variables in the equation above.
However, in the legacy program the intercept is removed and externally studentized residuals are calculated. Would you keep the intercept or not, and why?
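For reference, externally studentized residuals can be computed directly from the design matrix, whether or not it contains an intercept column. This is a minimal numpy sketch of the standard leave-one-out formula, not the legacy program's actual code; the function and variable names are my own.

```python
import numpy as np

def externally_studentized(X, y):
    """Externally studentized residuals for OLS of y on X.

    X should already contain a column of ones if an intercept is
    wanted (the legacy program described above omits it)."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # leverage: h_ii = x_i (X'X)^{-1} x_i'
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # leave-one-out residual variance s_(i)^2 via the standard identity,
    # avoiding n separate refits
    sse = resid @ resid
    s2_loo = (sse - resid**2 / (1.0 - h)) / (n - p - 1)
    return resid / np.sqrt(s2_loo * (1.0 - h))
```

Each residual is scaled by an error variance estimated with that observation left out, which is why outliers stand out more sharply than with internally studentized residuals.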
08-01-2017 01:14 PM
Context matters: it depends on the type of model and your subject area. In some cases this approach makes sense and in others it doesn't. I don't think we have enough information to actually answer your question.
08-01-2017 02:28 PM
It's hard for me to think of a good reason to leave the intercept out.
Even in the famous example of adding soap to water and measuring the amount of suds created, you would expect zero soap to produce zero suds, a true statement, but that's not the right thought process. If you are measuring the process near zero on the x-axis (near zero soap), you may find the process is non-linear and curves through the origin, in which case a linear regression is not appropriate. If you are measuring the process away from zero on the x-axis (far from zero soap), you may find a linear model fits well in that region, but the fitted line does not pass through the origin.
In any case, I would certainly include the intercept in the model, and test whether the intercept is statistically different from zero before even considering excluding it. I would also check the data for non-linearity.
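That test is just the usual t-test on the intercept coefficient. A small numpy sketch on simulated data (not the poster's data; the true intercept of 5 is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 6
X = rng.normal(size=(n, k))
# simulate a response with a genuinely nonzero intercept of 5
y = 5.0 + X @ np.arange(1.0, k + 1.0) + rng.normal(size=n)

Xd = np.column_stack([np.ones(n), X])      # design matrix with intercept
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta
s2 = resid @ resid / (n - Xd.shape[1])     # residual variance
cov = s2 * np.linalg.inv(Xd.T @ Xd)        # covariance of beta-hat
t_intercept = beta[0] / np.sqrt(cov[0, 0])
```

Here |t| is far above any reasonable critical value, so dropping the intercept would be clearly wrong; only when the intercept's t-statistic is small would omitting it even be defensible.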
09-12-2017 12:49 PM