- how to check the Multicollinearity between continuous and categorical ...


Posted 04-09-2018 12:00 PM
(16670 views)

Hi,

I am running a logistic model that includes both continuous and categorical variables. Do I still need to check for multicollinearity between them, and if so, how do I do that?

I know I can ignore the correlation test since the variables are not all continuous, but I am not sure how to check multicollinearity for these two kinds of variables in logistic regression.

Thanks,

Joe

Accepted Solutions


See this note about assessing collinearity in generalized linear models.


I think it is a very safe assumption that you have some (or maybe a lot of) multi-collinearity, so what are you going to do in the presence of multi-collinearity?

Logistic regression does not perform well in the presence of multi-collinearity. In my opinion, it is better to use a method that does perform well in its presence, namely Partial Least Squares regression. Unfortunately, there is no logistic Partial Least Squares in SAS; however, the method has been published at https://cedric.cnam.fr/fichiers/RC906.pdf

--

Paige Miller


Thank you for the comments!

I also think it is necessary to check for multicollinearity among these continuous and categorical predictors. I cannot find a VIF option in PROC LOGISTIC, so is there any other way I can check for multicollinearity?


Again, if you just assume there is multi-collinearity, what have you lost by not actually checking? Almost all real world data has multi-collinearity. What are you going to do about multi-collinearity if it is present?

--

Paige Miller


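In addition, if you really want to check the VIF, you can do that in PROC REG. The VIF calculation is the same for ordinary least squares regression and for logistic regression, because it is computed from the predictors alone and does not involve the response, so it doesn't matter which model you fit.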
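A minimal sketch of the PROC REG approach when some predictors are categorical, assuming you dummy-code the categorical variables by hand first (PROC REG has no CLASS statement). The data set SASHELP.HEART, the chosen variables, and the names `work.heart_design`, `died`, `female`, `bp_high`, and `bp_normal` are illustrative assumptions, not part of the original question.

```sas
/* Assumption: SASHELP.HEART stands in for your own data */
data work.heart_design;
   set sashelp.heart;
   where not missing(bp_status);
   died      = (status = 'Dead');      /* numeric 0/1 response; the VIFs do not depend on it */
   female    = (sex = 'Female');       /* reference level: Male */
   bp_high   = (bp_status = 'High');   /* reference level: Optimal */
   bp_normal = (bp_status = 'Normal');
run;

/* VIF and TOL request variance inflation factors and tolerances for each predictor */
proc reg data=work.heart_design plots=none;
   model died = ageatstart cholesterol systolic female bp_high bp_normal / vif tol;
run;
quit;
```

Each dummy variable gets its own VIF, so a categorical predictor with k levels contributes k-1 values. For predictors with many levels, the OUTDESIGN= option of PROC GLMSELECT can generate the dummy columns instead of a hand-written DATA step.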

--

Paige Miller


You could try CORRB option.

```
/* CORRB prints the correlation matrix of the parameter estimates */
proc logistic data=sashelp.class;
   model sex = age weight height / corrb;
run;
```

You will get the correlation matrix of the coefficient estimates.

For example, here the HEIGHT and INTERCEPT estimates have a correlation coefficient of -0.82, so they are correlated.


@Ksharp suggested the CORRB option, which gives the correlation matrix of the coefficient estimates (for example, -0.82 between HEIGHT and INTERCEPT).

This is not the same as having correlation between the original variables.

As stated in the link given by @StatDave, "Extremely large standard errors for one or more of the estimated parameters and large off-diagonal values in the parameter covariance matrix (COVB option) or correlation matrix (CORRB option) both suggest an ill-conditioned information matrix. However, these conditions can also happen for reasons not related to collinearity among the raw predictors."

But all this still seems to ignore the question: You almost certainly have collinearity among the predictors; all real world data does. So, what are you going to do about collinearity?

--

Paige Miller


Hi Paige,

Thanks for your answer. I am using VIF to deal with multicollinearity: I remove a variable from my model if its VIF > 10 and keep it otherwise. Did I understand your question correctly?
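The VIF > 10 screening rule described above could be sketched as follows, assuming the VIFs come from PROC REG and are captured with ODS OUTPUT. The data set SASHELP.CLASS, the model, and the names `work.pe` and `work.high_vif` are placeholders chosen for illustration; the cutoff of 10 is the poster's rule of thumb, not a fixed standard.

```sas
/* Capture the parameter estimates table, which includes the VIFs when VIF is requested */
ods output ParameterEstimates=work.pe;
proc reg data=sashelp.class plots=none;
   model weight = age height / vif;
run;
quit;

/* Keep only predictors whose VIF exceeds the chosen cutoff */
data work.high_vif;
   set work.pe;
   where Variable ne 'Intercept' and VarianceInflation > 10;
run;

proc print data=work.high_vif noobs;
   var Variable VarianceInflation;
run;
```

If `work.high_vif` is empty, no predictor exceeded the cutoff. Dropping a flagged variable changes the VIFs of the remaining ones, so the check would need to be rerun after each removal.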

