how to check the Multicollinearity between continu...

04-09-2018 12:00 PM

Hi,

I am running a logistic model that includes continuous and categorical variables. Do I still need to check for multicollinearity between them? If so, how do I do that?

I know I can ignore the correlation test since they are not all continuous variables, but I am not sure how to check multicollinearity for these two kinds of variables in logistic regression.

Thanks,

Joe


Posted in reply to joe66

04-09-2018 01:25 PM

I think it is a very safe assumption that you have some (or maybe a lot of) multi-collinearity, so what are you going to do in the presence of multi-collinearity?

Logistic regression does not perform well in the presence of multi-collinearity. A better approach, in my opinion, is to use a method that does perform well in its presence: Partial Least Squares regression. Unfortunately, there is no logistic Partial Least Squares in SAS; however, the method has been published at https://cedric.cnam.fr/fichiers/RC906.pdf

--

Paige Miller


Posted in reply to PaigeMiller

04-09-2018 02:57 PM

Thank you for the comments!

I also think it is necessary to check the multicollinearity among these continuous and categorical predictors. I cannot find a VIF option in PROC LOGISTIC, so is there any other way I can check for multicollinearity?


Posted in reply to joe66

04-09-2018 05:20 PM - edited 04-09-2018 05:24 PM

Again, if you just assume there is multi-collinearity, what have you lost by not actually checking? Almost all real world data has multi-collinearity. What are you going to do about multi-collinearity if it is present?

--

Paige Miller


Posted in reply to joe66

04-09-2018 05:36 PM

Adding: if you really want to check the VIF, you can do that in PROC REG. The VIF calculation is the same for ordinary least squares regression and for logistic regression, because it is computed from the predictors alone and does not involve the response.

--

Paige Miller
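
For readers who want to try the PROC REG route with a mix of continuous and categorical predictors, here is a minimal sketch. PROC REG has no CLASS statement, so the categorical variables have to be expanded into dummy columns first; this sketch does that with the OUTDESIGN= option of PROC GLMSELECT. The data set (`sashelp.heart`), the predictors, and the derived 0/1 response `dead` are all assumptions chosen for illustration and are not taken from the original question.

```sas
/* Illustrative only: sashelp.heart, the predictors, and DEAD are assumptions. */
data heart;
   set sashelp.heart;
   dead = (Status = 'Dead');   /* numeric 0/1 response; PROC REG needs a numeric response */
run;

/* Build the design matrix (dummy columns for the CLASS variables).      */
/* SELECTION=NONE keeps every effect; PARAM=REF drops one level per      */
/* CLASS variable so the dummies are not collinear with the intercept.   */
proc glmselect data=heart outdesign(addinputvars)=design noprint;
   class Sex Smoking_Status / param=ref;
   model dead = Sex Smoking_Status AgeAtStart Cholesterol Systolic
         / selection=none;
run;

/* The macro variable _GLSMOD holds the names of the design columns.     */
/* VIF and TOL report variance inflation and tolerance per column;       */
/* COLLIN adds condition indices.                                        */
proc reg data=design plots=none;
   model dead = &_GLSMOD / vif tol collin;
run;
quit;
```

The response used in the MODEL statement does not change the VIF values, since each VIF comes from regressing one predictor on the others. Note that a categorical variable with several levels gets one VIF per dummy column, so the values need to be read together rather than one column at a time.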


Posted in reply to joe66

04-10-2018 09:56 AM

You could try CORRB option.

```
proc logistic data=sashelp.class;
model sex=age weight height/corrb;
run;
```

You will get the correlation matrix of the coefficient estimates.

For example, here HEIGHT and INTERCEPT have a correlation coefficient of -0.82, so their estimates are correlated.


Posted in reply to Ksharp

04-10-2018 10:16 AM - edited 04-10-2018 10:18 AM

@Ksharp wrote:

You could try CORRB option.

`proc logistic data=sashelp.class; model sex=age weight height/corrb; run;`

You will get the correlation matrix of the coefficient estimates.

For example, here HEIGHT and INTERCEPT have a correlation coefficient of -0.82, so their estimates are correlated.

This is not the same as having correlation between the original variables.

As stated in the link given by @StatDave_sas, "Extremely large standard errors for one or more of the estimated parameters and large off-diagonal values in the parameter covariance matrix (COVB option) or correlation matrix (CORRB option) both suggest an ill-conditioned information matrix. However, these conditions can also happen for reasons not related to collinearity among the raw predictors."

But all this still seems to ignore the question: You almost certainly have collinearity among the predictors; all real world data does. So, what are you going to do about collinearity?

--

Paige Miller


Posted in reply to PaigeMiller

04-10-2018 12:10 PM

Hi Paige,

Thanks for your answer. I am using VIF to deal with multicollinearity: I remove a variable from my model if its VIF > 10 and keep it otherwise. Did I understand your question correctly?
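
The VIF > 10 screening rule described here could be sketched roughly as follows. This is an illustration under stated assumptions, not a recommended workflow: `sashelp.class` and its variables stand in for the real predictors, `pe` and `high_vif` are hypothetical data set names, and the code assumes the VIF column in the ODS `ParameterEstimates` table is named `VarianceInflation`.

```sas
/* Illustrative data: sashelp.class stands in for the real predictors.   */
/* Capture the parameter estimates table, which carries the VIF column   */
/* when the VIF option is requested.                                     */
ods output ParameterEstimates=pe;
proc reg data=sashelp.class plots=none;
   model weight = age height / vif;
run;
quit;

/* Keep only the predictors whose VIF exceeds the cutoff of 10.          */
/* The column name VarianceInflation is an assumption about the ODS table. */
data high_vif;
   set pe;
   where Variable ne 'Intercept' and VarianceInflation > 10;
run;

/* List the flagged predictors, if any. */
proc print data=high_vif noobs;
   var Variable VarianceInflation;
run;
```

Removing one predictor changes the VIFs of those that remain, so the values would need to be recomputed after each removal rather than dropping every flagged variable at once.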


Posted in reply to joe66

04-10-2018 10:04 AM

See this note about assessing collinearity in generalized linear models.