Hello,
My project is to estimate adjusted odds ratios from a multiple logistic regression (1 dependent variable and multiple independent variables) based on data from a complex survey. I learnt in school that I should check the independent variables for multicollinearity. Several of my categorical variables have more than 2 categories, so I should check the GVIF rather than the VIF. However, I am unable to find a procedure that calculates GVIF. I would appreciate any help.
Thank you.
I am not a stats guy, but maybe this paper helps: https://support.sas.com/resources/papers/proceedings17/1404-2017.pdf
PROC LOGISTIC does not compute VIF or GVIF.
But you could use the CORRB option to check the correlations between the parameter estimates.
If the correlation between two estimates is high, the corresponding variables are highly correlated, and you could drop one of them.
/* CORRB prints the estimated correlation matrix of the parameter estimates */
proc logistic data=sashelp.heart;
   class bp_Status;
   model status = weight height bp_Status / corrb;
run;
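Another commonly used workaround is to build the dummy-coded design matrix and pass it to PROC REG, which does have VIF, TOL and COLLIN options. The sketch below assumes the same sashelp.heart example. The data set names heart and design and the 0/1 variable dead are made up for illustration. It gives one VIF per design column, not a GVIF per categorical variable, and it ignores the survey design (a WEIGHT statement in PROC REG could be added for the sampling weight).

```sas
/* Numeric 0/1 response: VIFs depend only on the predictors,  */
/* but PROC GLMSELECT and PROC REG need a numeric response.    */
data heart;
   set sashelp.heart;
   dead = (status = 'Dead');
run;

/* Build the design matrix with reference-cell dummy columns.  */
/* SELECTION=NONE keeps every effect; the names of the design  */
/* columns are stored in the macro variable _GLSMOD.           */
proc glmselect data=heart outdesign(addinputvars)=design noprint;
   class bp_status / param=ref;
   model dead = weight height bp_status / selection=none;
run;

/* VIF, tolerance and condition indices for the design columns */
proc reg data=design plots=none;
   model dead = &_GLSMOD / vif tol collin;
run;
quit;
```

A large VIF on a dummy column only indicates that the column is close to a linear combination of the others, so read the result as a rough screen rather than as a substitute for GVIF.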
Hi there. I am also building a logistic regression model and am also troubled by collinearity. I consulted @PaigeMiller and @StatDave_SAS about diagnosing and handling collinearity in logistic regression.
In short, the conclusions are:
(1) As @Ksharp has mentioned, SAS does not support computing GVIF.
(2) Collinearity can be diagnosed with principal component analysis (PROC PRINCOMP). But be cautious about using principal component analysis to reduce the dimension of the data, since some of the dimensions it finds may not be good predictors of the dependent variable.
(3) Therefore, when the concern is that collinearity makes the information matrix used in model fitting ill-conditioned (rather than reducing the dimension of the data), it is recommended to handle it with the methods described in this note.
(4) When it comes to reducing the dimension of the data, both Logistic Partial Least Squares Regression (not available in SAS, but available as a package in R) and penalty-based model selection methods such as LASSO (available in PROC HPGENSELECT) can be good choices. The latter (e.g. LASSO) selects a subset of the candidate predictors rather than combining them all into a small number of functions. In this case, you may not need to diagnose collinearity at all.
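As a rough illustration of points (2) and (4), here is a minimal sketch using sashelp.heart. The choice of variables is only an assumption for the example, and neither step accounts for a complex survey design. In the PROC PRINCOMP output, eigenvalues close to zero suggest near-collinearity among the listed continuous predictors.

```sas
/* (2) Diagnose collinearity among continuous predictors:   */
/* very small eigenvalues of the correlation matrix point   */
/* to near-linear dependencies.                             */
proc princomp data=sashelp.heart;
   var weight height systolic diastolic cholesterol;
run;

/* (4) LASSO selection for a binary response with default   */
/* settings; it keeps a subset of the candidate predictors. */
proc hpgenselect data=sashelp.heart;
   class bp_status;
   model status(event='Dead') = weight height systolic diastolic
                                cholesterol bp_status / dist=binary;
   selection method=lasso;
run;
```

The HPGENSELECT step relies on the procedure's defaults for choosing the final model. Check the SELECTION statement documentation before relying on the result.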
You can take a look at my post (How can I perform principal component analysis for logistic regression via SAS?) to see the original replies by @PaigeMiller and @StatDave_SAS.
Good luck!