11-09-2012 04:14 PM
If Y is regressed on X1 only, R-square is 0.72 and X1's coefficient is significant.
If Y is regressed on X2 only, R-square is 0.002 and X2's coefficient is NOT significant.
If Y is regressed on X1 and X2 together, R-square is 0.76 and the coefficients of both X1 and X2 are significant.
1. Why does X2 become significant when it is combined with X1? Is there an intuitive explanation for this?
2. Should I include X2 in the final model?
11-09-2012 04:28 PM
1) One scenario is that Y is highly correlated with X1, and that X2 and Y are nearly orthogonal. That would mean that X1 explains Y very well, but X2 does not. However, after you fit Y to X1, it might be that the RESIDUALS are predicted by X2!
2) I'll let others discuss whether you should include X2. You should probably look at the adjusted R-square to see if there is incremental value in choosing the more complicated model.
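The residuals intuition in 1) can be checked with a small simulation. This is only a sketch with made-up data (the variable names `signal` and `nuisance` are mine, not from the thread): X1 carries the signal plus some nuisance variance, and X2 measures only that nuisance, so X2 alone predicts nothing but cleans up X1 in the joint model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
signal = rng.normal(size=n)             # the part of X1 that actually drives Y
nuisance = rng.normal(size=n)           # the part of X1 unrelated to Y
X1 = signal + nuisance
X2 = nuisance                           # X2 alone says nothing about Y ...
Y = signal + 0.3 * rng.normal(size=n)   # ... but it can soak up X1's noise

def ols(y, *xs):
    """OLS with intercept; returns (t-statistics, R-square, adjusted R-square)."""
    X = np.column_stack([np.ones(len(y)), *xs])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    se = np.sqrt(resid @ resid / dof * np.diag(np.linalg.inv(X.T @ X)))
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1 - resid @ resid / sst
    adj = 1 - (1 - r2) * (len(y) - 1) / dof
    return beta / se, r2, adj

t1, r2_x1, adj_x1 = ols(Y, X1)            # X1 alone: decent fit
t2, r2_x2, adj_x2 = ols(Y, X2)            # X2 alone: R-square near zero
t12, r2_both, adj_both = ols(Y, X1, X2)   # together: both coefficients significant
```

In this setup X2 alone has R-square near zero, yet its t-statistic in the joint model is large (its coefficient lands near -1, removing the nuisance from X1). Comparing the adjusted R-squares of the two-variable and one-variable models is the check suggested in 2).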
11-09-2012 04:54 PM
The fitted model using X1 and X2 is
Y = 0.5 + 1.2*X1 + 1.1*X2 + e
This equation tells me that if I increase X2 by one unit while X1 remains constant, Y will increase by 1.1. Is this conclusion wrong, given that when Y is regressed on X2 only, X2 is not significant?
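Plugging numbers into the fitted equation makes the "holding X1 constant" reading concrete (the particular values of X1 and X2 below are arbitrary, chosen only for illustration):

```python
def yhat(x1, x2):
    # fitted equation from the post: Y = 0.5 + 1.2*X1 + 1.1*X2
    return 0.5 + 1.2 * x1 + 1.1 * x2

# raise X2 by one unit while holding X1 fixed at 2.0:
delta = yhat(2.0, 4.0) - yhat(2.0, 3.0)  # predicted Y rises by the X2 coefficient, 1.1
```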
11-09-2012 08:28 PM
It could be that not all regressions are performed on the same data. What is the pattern of missing values in X1 and X2, i.e. what are the Ns of the three regressions? - PG
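A quick way to check this is to count the complete cases each regression would use. A sketch with hypothetical data (the arrays below are invented; real regression software reports these Ns in its output):

```python
import numpy as np

# hypothetical columns with missing values encoded as np.nan
x1 = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
x2 = np.array([0.3, np.nan, 0.8, 1.1, np.nan])
y  = np.array([2.1, 2.9, 3.2, 4.8, 5.5])

n_x1 = np.sum(~np.isnan(y) & ~np.isnan(x1))                     # N for Y ~ X1
n_x2 = np.sum(~np.isnan(y) & ~np.isnan(x2))                     # N for Y ~ X2
n_both = np.sum(~np.isnan(y) & ~np.isnan(x1) & ~np.isnan(x2))   # N for Y ~ X1 + X2
```

If the three Ns differ, the three R-squares are computed on different subsets of the data and are not directly comparable.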
11-24-2012 11:13 AM
Read about the phenomenon called "suppression". One article that describes it is the following:
Lynn HS. Suppression and confounding in action. The American Statistician 2003 Feb;57(1):58-61.