Programming the statistical procedures from SAS

Multicollinearity Diagnosis for Logistic Regression Using Proc Reg

Occasional Contributor Yan
Posts: 9

I am running Proc Reg to check multicollinearity for logistic regression models. Almost all the independent variables are categorical. I constructed dummy variables and put k-1 dummies for each variable in the Proc Reg models. For collinearity diagnosis in Proc Reg, there are two options, COLLIN and COLLINOINT. I am wondering whether I should use the same model for both options, since the latter excludes the intercept from the calculation. Should I put in all k dummies rather than k-1 dummies when using the COLLINOINT option? Thanks!

Accepted Solutions
‎07-06-2017 10:15 AM
Regular Contributor
Posts: 169

Re: Multicollinearity Diagnosis for Logistic Regression Using Proc Reg

With more than one categorical variable, I would run the collinearity diagnostics using k{i}-1 dummy variables for the i-th categorical variable (where k{i} is the number of levels of that variable) AND I would include the intercept. By using k{i}-1 dummy variables for the i-th categorical variable, you leave out the reference level of each categorical variable, so the model is not overparameterized. Including the intercept along with the k{i}-1 dummy variables also does not result in an overparameterized model.

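If you were to use k{i} dummy variables for each categorical variable and you have two or more categorical variables, then you will end up with an overparameterized model, even without the intercept: the full set of dummies for each variable sums to 1 in every observation, so the dummy sets are linearly dependent on one another. So, it is best to use k{i}-1 dummy variables and include the intercept.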
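
To make the overparameterization point concrete, here is a minimal sketch that counts columns and computes the column rank of the two design matrices. It is written in Python rather than SAS, and everything in it is an assumption for illustration only: the two hypothetical categorical variables A (3 levels) and B (4 levels), the made-up data with one observation per A-by-B combination, and the helper names `matrix_rank` and `dummies`. It is not what Proc Reg does internally; it only checks whether the columns are linearly independent.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Column rank via Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # This column depends on the earlier ones.
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def dummies(level, n_levels, drop_reference):
    """0/1 indicators for one observation; level 0 is the reference when dropped."""
    start = 1 if drop_reference else 0
    return [1 if level == j else 0 for j in range(start, n_levels)]


# Hypothetical data: one observation for every combination of A and B.
K_A, K_B = 3, 4
cells = list(product(range(K_A), range(K_B)))

# Intercept plus k{i}-1 dummies per variable (the recommended coding).
reference_coded = [[1] + dummies(a, K_A, True) + dummies(b, K_B, True)
                   for a, b in cells]

# All k{i} dummies per variable and no intercept.
full_coded = [dummies(a, K_A, False) + dummies(b, K_B, False)
              for a, b in cells]

ref_cols, ref_rank = len(reference_coded[0]), matrix_rank(reference_coded)
full_cols, full_rank = len(full_coded[0]), matrix_rank(full_coded)

print("intercept + k-1 dummies:", ref_cols, "columns, rank", ref_rank)
print("all k dummies, no intercept:", full_cols, "columns, rank", full_rank)
```

With this made-up data the first matrix has 6 columns and rank 6, so nothing is redundant. The second has 7 columns but still only rank 6, because the three A dummies and the four B dummies each add up to a column of ones. Dropping the intercept therefore does not make it safe to put in all k{i} dummies once there are two or more categorical variables.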
