kurofufu
Calcite | Level 5

If Y is regressed on X1 only, R-square is 0.72 and the coefficient of X1 is significant.

If Y is regressed on X2 only, R-square is 0.002 and the coefficient of X2 is NOT significant.

If Y is regressed on X1 and X2, R-square is 0.76 and the coefficients of both X1 and X2 are significant.

Questions:

1. Why does X2 become significant when it is combined with X1? Is there an intuitive explanation for this?

2. Should I use X2 in the final model?

Rick_SAS
SAS Super FREQ

1) One scenario is that Y is highly correlated with X1, and that X2 and Y are nearly orthogonal.  That would mean that X1 explains Y very well, but X2 does not. However, after you fit Y to X1, it might be that the RESIDUALS are predicted by X2!

2) I'll let others discuss whether you should include X2. You should probably look at the adjusted R-square to see if there is incremental value in choosing the more complicated model.
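
For reference, the adjusted R-square mentioned in point 2 is usually defined as follows, where n is the number of observations and p is the number of predictors excluding the intercept (these symbols are notation introduced here, not from the thread):

```latex
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Because the penalty factor (n - 1)/(n - p - 1) grows with p, adding X2 raises the adjusted R-square only if the gain in R-square outweighs the cost of the extra parameter.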

Reeza
Super User

What happens when you add the interaction term x1*x2?

kurofufu
Calcite | Level 5

The adjusted R-square is nearly the same as the R-square: R-square = 0.7598, Adj R-square = 0.7581.

The interaction term X1*X2 is NOT significant.

kurofufu
Calcite | Level 5

The fitted model using X1 and X2 is 

Y = 0.5 + 1.2*X1 + 1.1*X2 + e

This equation tells me that if I increase X2 by one unit while X1 remains constant, Y will increase by 1.1. Is this conclusion wrong, given that X2 is not significant when Y is regressed on X2 only?

PGStats
Opal | Level 21

It could be that not all regressions are performed on the same data. What is the pattern of missing values in X1 and X2, i.e. what are the Ns of the three regressions? - PG

kurofufu
Calcite | Level 5

I just checked. All three regressions have the same N, and there are no missing values in X1 or X2.

1zmm
Quartz | Level 8

Read about the phenomenon called "suppression".  One article that describes it is the following:

   Lynn HS.  Suppression and confounding in action.  The American Statistician 2003 Feb;57(1):58-61.
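
The pattern discussed in this thread can be reproduced with simulated data. The sketch below is only an illustration under stated assumptions, not the poster's data: the names `signal`, `nuisance`, and `ols`, the sample size, and the noise scales are all hypothetical. X1 is built as signal minus a nuisance component, X2 is that nuisance component (forced to be uncorrelated with Y in the sample), and Y depends only on the signal. On its own X2 explains nothing, but next to X1 it removes the nuisance from X1 and becomes significant.

```python
import numpy as np

def ols(y, *predictors):
    """Fit y on an intercept plus the predictors; return (R-square, t-statistics)."""
    n = len(y)
    X = np.column_stack([np.ones(n), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    sigma2 = sse / (n - X.shape[1])              # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return 1 - sse / sst, beta / se

# Hypothetical simulation settings (assumptions, not taken from the thread).
rng = np.random.default_rng(0)
n = 300
signal = rng.normal(size=n)                      # the part of X1 that drives Y
nuisance = rng.normal(scale=0.25, size=n)        # the part of X1 unrelated to Y
y = signal + rng.normal(scale=0.55, size=n)

x1 = signal - nuisance                           # X1 is contaminated by the nuisance

# X2 is the nuisance, residualized on Y so that its sample correlation with Y is zero.
Z = np.column_stack([np.ones(n), y])
coef, *_ = np.linalg.lstsq(Z, nuisance, rcond=None)
x2 = nuisance - Z @ coef

r2_x1, t_x1 = ols(y, x1)
r2_x2, t_x2 = ols(y, x2)
r2_both, t_both = ols(y, x1, x2)

print(f"Y ~ X1:      R2 = {r2_x1:.4f}, t(X1) = {t_x1[1]:.2f}")
print(f"Y ~ X2:      R2 = {r2_x2:.4f}, t(X2) = {t_x2[1]:.2f}")
print(f"Y ~ X1 + X2: R2 = {r2_both:.4f}, t(X1) = {t_both[1]:.2f}, t(X2) = {t_both[2]:.2f}")
```

In the joint fit, the coefficient of X2 is the effect of X2 with X1 held fixed, which is a different quantity from the marginal slope in the X2-only regression. That is why the two fits can disagree about significance without either being wrong.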
