Ujjawal
Quartz | Level 8

We get different VIF values from a linear regression model depending on whether it is fit with or without an intercept. A model with NOINT yields higher VIFs than one with an intercept. I understand this is because the R-square values differ. Theoretically, a model with no intercept makes sense, but the values are very high. If I use the rule of thumb that variables with VIF > 10 should be checked for collinearity, I would have to drop a lot of variables. My goal is to correct collinearity for a logistic regression model, so I am not interested in the R-square of the linear regression model itself. Which method is correct?

 

proc reg data=sashelp.cars;
model MPG_City = weight Horsepower EngineSize Wheelbase Cylinders / vif noint;
ods output parameterestimates=parest1;
run;
quit;

 

proc reg data=sashelp.cars;
model MPG_City = weight Horsepower EngineSize Wheelbase Cylinders / vif;
ods output parameterestimates=parest2;
run;
quit;

1 REPLY 1
JBerry
Quartz | Level 8

When you choose no intercept, you are essentially forcing the intercept to zero. In other words, you are saying "when my x's are all 0, I expect my Y to be 0." That is the only situation in which you should use that option.

 

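The reason you get high VIFs in a no-intercept situation is that VIF is a ratio related to R2 (for each predictor, the R2 from regressing it on the other predictors). And when you choose no intercept, you can get much higher R2 values, because R2 is then measured around zero rather than around the mean. Why? That's a long one, but here's a great article on it.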

 

Now that you've learned why R2 goes up, look back at the VIF equation:   VIF = 1 / (1 - R2)   So it's easy to see that as R2 goes up, the denominator goes down, which makes the overall VIF go up. 🙂
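
To make the effect concrete, here is a small sketch in plain Python (not SAS, just for illustration). It assumes the simplest case of only two predictors, so the R2 in the VIF formula is the squared correlation between them: centered when there is an intercept, uncentered (raw sums of squares) when there is not. The function name and the data are made up for this example.

```python
def vif_two_predictors(x1, x2, intercept=True):
    """VIF of x1 when x2 is the only other predictor (illustrative helper).

    With an intercept, R2 is the usual squared correlation (data centered
    on their means). Without one, R2 is the uncentered version computed
    from raw sums of squares and cross-products.
    """
    if intercept:
        m1 = sum(x1) / len(x1)
        m2 = sum(x2) / len(x2)
        x1 = [v - m1 for v in x1]
        x2 = [v - m2 for v in x2]
    s12 = sum(a * b for a, b in zip(x1, x2))
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    r2 = s12 * s12 / (s11 * s22)
    return 1.0 / (1.0 - r2)  # VIF = 1 / (1 - R2)


# Made-up data: two predictors that are unrelated once centered,
# but whose values all sit far away from zero.
x1 = [100, 102, 98, 101, 99, 103]
x2 = [50, 49, 51, 52, 48, 50]

print("VIF with intercept:   ", vif_two_predictors(x1, x2, intercept=True))
print("VIF without intercept:", vif_two_predictors(x1, x2, intercept=False))
```

With these numbers the intercept version stays at about 1, while the no-intercept version is very large, even though the predictors are not collinear in the usual sense. The inflation comes only from both variables having means far from zero.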

 

 

tl;dr  Using noint is for a specific scenario, in which R2 is measured differently and is often higher. Since VIF is related to R2, it also goes up.
