11-09-2012 04:14 PM
If Y is regressed on X1 only, R-square is 0.72 and X1's coefficient is significant.
If Y is regressed on X2 only, R-square is 0.002 and X2's coefficient is NOT significant.
If Y is regressed on X1 and X2, R-square is 0.76 and the coefficients of both X1 and X2 are significant.
1. Why does X2 become significant when it is combined with X1? Is there a more intuitive explanation for this?
2. Should I use X2 in the final model?
11-09-2012 04:28 PM
1) One scenario is that Y is highly correlated with X1, and that X2 and Y are nearly orthogonal. That would mean that X1 explains Y very well, but X2 does not. However, after you fit Y to X1, it might be that the RESIDUALS are predicted by X2!
2) I'll let others discuss whether you should include X2. You should probably look at the adjusted R-square to see if there is incremental value in choosing the more complicated model.
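The scenario in 1) is easy to simulate. Here is a minimal sketch (the variable names and data-generating process are made up for illustration, not taken from the original poster's data): X1 carries the signal plus some noise, X2 is essentially that same noise, so X2 alone is useless for predicting Y, yet adding it lets the model subtract the noise out of X1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
t = rng.normal(size=n)                  # true signal
e = rng.normal(size=n)                  # noise mixed into X1
x1 = t + e                              # X1 = signal + noise
x2 = e + 0.1 * rng.normal(size=n)       # X2 ~ the noise, nearly orthogonal to Y
y = t + 0.1 * rng.normal(size=n)        # Y driven by the signal only

def r2(y, cols):
    """R-square of an OLS fit of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print(r2(y, [x1]))        # moderate: X1 alone explains part of Y
print(r2(y, [x2]))        # near zero: X2 alone explains almost nothing
print(r2(y, [x1, x2]))    # high: together, X2 "suppresses" the noise in X1
```

Running this reproduces the same qualitative pattern as in the question: X2 is worthless on its own but clearly improves the two-predictor model.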
11-09-2012 04:54 PM
The fitted model using X1 and X2 is
Y = 0.5 + 1.2*X1 + 1.1*X2 + e
This equation tells me that if X2 increases by one unit while X1 is held constant, Y will increase by 1.1. Is this a wrong conclusion given that when Y is regressed on X2 only, X2 is not significant?
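That "holding X1 constant" reading is exactly what the Frisch-Waugh-Lovell theorem formalizes: the multiple-regression coefficient on X2 equals the slope from regressing the residuals of (Y ~ X1) on the residuals of (X2 ~ X1). A quick numerical check (simulated data; the coefficients 0.5, 1.2, 1.1 are just borrowed from the fitted equation above for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)            # correlated predictors
y = 0.5 + 1.2 * x1 + 1.1 * x2 + rng.normal(size=n)

def ols(y, cols):
    """OLS coefficients [intercept, slopes...] via least squares."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b_full = ols(y, [x1, x2])                     # [b0, b1, b2]

# Frisch-Waugh-Lovell: partial out X1 from both Y and X2,
# then regress residual on residual to recover b2.
ry  = y  - np.column_stack([np.ones(n), x1]) @ ols(y,  [x1])
rx2 = x2 - np.column_stack([np.ones(n), x1]) @ ols(x2, [x1])
b2_fwl = ols(ry, [rx2])[1]

print(b_full[2], b2_fwl)                      # the two estimates agree
```

So the coefficient on X2 describes X2's effect after X1's contribution has been partialled out, which is precisely why it can be large and significant even when X2 is useless on its own.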
11-24-2012 11:13 AM
Read about the phenomenon called "suppression". One article that describes it is the following:
Lynn HS. Suppression and confounding in action. The American Statistician 2003 Feb;57(1):58-61.