Logistic Regression dataset - high vif for a varia...

07-22-2014 07:17 PM

Hi

I am trying to identify variables with multicollinearity by running a linear regression with the VIF option, using one of the independent variables as the dependent variable. One of the variables with a high VIF has no correlation with any other variable. I wonder why its VIF is high, though.

It would be great to hear your input on this. Thanks in advance!

Accepted Solutions

Posted in reply to ruchikasi

07-23-2014 08:56 AM

I don't quite understand this approach to calculating VIF. What happens if you select a different one of the IVs as the dependent variable? And what if the first selected IV is highly correlated with another of the set? That would lead to a large VIF. Why not just run the usual full set of IVs? Can you point me to a reference for the method being proposed?

Steve Denham
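As a rough illustration of why the full set of IVs matters: VIF is computed from the predictors alone, as 1 / (1 - R²) from regressing each predictor on all the others, which equals the corresponding diagonal element of the inverse correlation matrix. A predictor with exactly zero correlation with every other predictor would have a VIF of 1, but modest pairwise correlations can still add up to a very high VIF when the variable is close to a linear combination of several others. The sketch below uses simulated data; the variable names, sample size, and noise level are all assumptions made for the example, not anything taken from the original dataset.

```python
import random

def correlation_matrix(cols):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(cols[0])
    scaled = []
    for col in cols:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = sum(d * d for d in dev) ** 0.5
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled]
            for ci in scaled]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * u for v, u in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]

def vif(cols):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(cols))
    return [inv[j][j] for j in range(len(cols))]

# Simulated (hypothetical) predictors: eight independent columns plus one
# column that is nearly their sum.
random.seed(0)
n = 500
base = [[random.gauss(0, 1) for _ in range(n)] for _ in range(8)]
combo = [sum(col[i] for col in base) + 0.3 * random.gauss(0, 1)
         for i in range(n)]
predictors = base + [combo]

corr = correlation_matrix(predictors)
max_pairwise = max(abs(corr[8][j]) for j in range(8))
vifs = vif(predictors)

print(f"largest |r| between combo and any single predictor: {max_pairwise:.2f}")
print(f"VIF of combo: {vifs[8]:.1f}")
```

No response variable appears anywhere in the calculation, so moving one IV into the dependent slot only changes which predictors are included in the VIF computation.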

All Replies


Posted in reply to ruchikasi

07-23-2014 08:51 AM

Are these variables numeric or character (categorical)?

Posted in reply to ruchikasi

07-23-2014 12:09 PM

Numeric. Yes, Steve, I was running VIF on a dataset prepared for logistic regression. I was not sure whether I could run PROC REG with the VIF option using a binary variable as the dependent variable. I realized I could, and then ran VIF with the full set of IVs, which were numeric.

Follow-up question: what about categorical variables? Is VIF the best diagnostic for multicollinearity, or should I be looking at some other diagnostic such as Spearman correlation, polychoric correlation, etc.?


Posted in reply to ruchikasi

07-24-2014 07:53 AM

The estimation methods of PROC REG and PROC LOGISTIC are different. PROC REG uses OLS while PROC LOGISTIC uses ML, therefore there is no need to check VIF in PROC REG for a logistic model.

BTW, as far as I know, you can't use a binary variable as the dependent variable: PROC REG assumes the residuals are ~ N(0, σ²), whereas a logistic model assumes a binomial distribution.

Xia Keshan


Posted in reply to Ksharp

07-24-2014 08:28 AM

Working from the discussion above, I found a usage note that gives a method for calculating VIF for logistic regression. Granted, it uses PROC GENMOD rather than PROC LOGISTIC, but for what needs to be done to get the measures, I think this is what you need. The two-step process extracts Hessian weights that can then be used in PROC REG for multicollinearity diagnostics.

See Usage Note 32471: Testing for equal variances, collinearity, or normality in logit, probit, Poisson and other generalized linear models. The link is:

http://support.sas.com/kb/32/471.html

Hope this gets you moving in the right direction.

Steve Denham
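To make the two-step idea concrete, here is a minimal sketch of a weighted VIF. It is not the usage note's SAS code. It assumes a binary response with a logit link, where the Hessian weight for observation i is p_i(1 - p_i), and it assumes the weighted VIF is 1 / (1 - R²) from a weighted regression of each predictor on the others. With only two predictors that reduces to 1 / (1 - r²), where r is the weighted correlation between them. The data and the coefficients `b0`, `b1`, `b2` are hypothetical stand-ins for a real fitted model.

```python
import math
import random

def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y with weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return sxy / math.sqrt(sxx * syy)

def two_predictor_vif(x1, x2, w):
    """With two predictors, both share the VIF 1 / (1 - r^2)."""
    r = weighted_corr(x1, x2, w)
    return 1.0 / (1.0 - r * r)

# Simulated (hypothetical) correlated predictors.
random.seed(1)
n = 400
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + 0.6 * random.gauss(0, 1) for a in x1]

# Hypothetical coefficients standing in for a fitted logistic model.
b0, b1, b2 = -0.5, 1.2, 0.7
p_hat = [1.0 / (1.0 + math.exp(-(b0 + b1 * a + b2 * b)))
         for a, b in zip(x1, x2)]

# Assumed form of the Hessian weights for a binary logit model.
hess_w = [p * (1.0 - p) for p in p_hat]

vif_unweighted = two_predictor_vif(x1, x2, [1.0] * n)
vif_weighted = two_predictor_vif(x1, x2, hess_w)

print(f"unweighted VIF: {vif_unweighted:.2f}")
print(f"Hessian-weighted VIF: {vif_weighted:.2f}")
```

The two numbers generally differ because the weights emphasize observations whose fitted probability is near 0.5, so the collinearity that matters to the logistic fit is not necessarily the same as in the raw predictors.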