Hello all,
Is there any way to check multicollinearity using proc GLM?
proc glm data=tmp ;
class study CNS stk_loc sex;
model fim =study sex age CNS stk_loc /solution ss3 ;
run;
The dependent variable is continuous.
All the independent variables are categorical except age, which is continuous.
I am familiar with VIF in PROC REG, but that requires creating dummy variables first.
Any input is appreciated.
1) You could use PROC GLMSELECT to screen out the collinear variables.
2) You could use PROC GENMOD with the CORRB option to check the correlations between the estimated coefficients.
proc genmod data=sashelp.heart;
class status bp_Status sex;
model weight = status bp_Status sex height / corrb;
run;
Use PROC GLMMOD to obtain the x matrix used by PROC GLM. Then run PROC REG with the VIF option, using the output of PROC GLMMOD as input to PROC REG.
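A minimal sketch of that two-step approach, applied to the model from the question. The data set names design and parms are placeholders chosen for illustration. Because PROC GLMMOD uses the same less-than-full-rank coding as PROC GLM, expect PROC REG to note that the model is not full rank and to set the redundant dummy columns to zero; the VIF values for the remaining columns are the ones to read.

```sas
/* Step 1: write out the design matrix that PROC GLM would build.    */
/* DESIGN and PARMS are placeholder data set names (assumptions).    */
proc glmmod data=tmp outdesign=design outparm=parms noprint;
   class study CNS stk_loc sex;
   model fim = study sex age CNS stk_loc;
run;

/* PARMS maps each column (Col1, Col2, ...) to its effect and level. */
proc print data=parms;
run;

/* Step 2: regress on the design columns, requesting VIF and         */
/* tolerance. Col1 is the intercept column, so it is dropped here    */
/* and PROC REG supplies its own intercept.                          */
proc reg data=design(drop=col1);
   model fim = col: / vif tol;
run;
quit;
```

The printed parms data set is what lets you translate a large VIF on, say, Col5 back to a specific level of a specific CLASS variable.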
Or maybe this: https://stackoverflow.com/questions/77531415/sas-basic-analysis-problems
Concurring with @PaigeMiller.
The results are equivalent, but the columns of the data set produced by ODS have names that are directly related to the names of their corresponding effects.
Here's an example of the latter option:
SAS Help Center: Example 49.2 Factorial Screening
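/* Capture the design matrix from PROC GLMMOD through ODS */
ods output DesignPoints = DesignMatrix;
proc glmmod data=Screening;
   model y = a|b|c|d|e@2;
run;
/* Use the named design columns as regressors in PROC REG */
proc reg data=DesignMatrix;
   model y = a--d_e;
   model y = a--d_e / selection = forward
                      details   = summary
                      slentry   = 0.05;
run;
quit;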
Koen
@bhr-q wrote:
Thanks for your answer, it was interesting to get the tolerance using Proc GLM, but when I used Proc GLM with the tolerance option it didn't show me any tolerance.
Weird.
Here's some info on the tolerance output in PROC GLM:
The Type II tolerance is consistent with the TOL option on the MODEL statement in PROC REG, which is 1/VIF.
It is your choice which tolerance to use.
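As a quick check of the 1/VIF relationship, here is a small sketch using sashelp.class (chosen only because it ships with SAS; it is not the data from the question). Requesting both options side by side should show each tolerance as the reciprocal of the corresponding VIF.

```sas
/* TOL and VIF together: each tolerance should equal 1/VIF */
/* for the same regressor.                                 */
proc reg data=sashelp.class;
   model weight = height age / tol vif;
run;
quit;
```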
Koen
@Ksharp wrote:
1) You could use PROC GLMSELECT to screen out the collinear variables.
2) You could use PROC GENMOD with the CORRB option to check the correlations between the estimated coefficients.
proc genmod data=sashelp.heart;
class status bp_Status sex;
model weight = status bp_Status sex height / corrb;
run;
The problem with CORRB is that it only looks at pairwise correlations. For some data sets that may be fine, but it will miss more complicated dependencies. VIF (and tolerance) measures how strongly each parameter is correlated with the combination of all the other parameters in the model.
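To illustrate the point, here is a small simulated sketch (the data set collin and the variables x1-x4 and y are made up for this example). x4 is nearly the sum of x1, x2 and x3, so no single pairwise correlation looks alarming, yet the VIFs should come out very large.

```sas
/* Simulated data: x4 is almost an exact linear combination */
/* of x1, x2 and x3. All names here are hypothetical.       */
data collin;
   call streaminit(1);
   do i = 1 to 200;
      x1 = rand('normal');
      x2 = rand('normal');
      x3 = rand('normal');
      x4 = x1 + x2 + x3 + 0.1*rand('normal');
      y  = x1 + rand('normal');
      output;
   end;
   drop i;
run;

/* Pairwise view: x4 correlates only moderately (roughly 0.6) */
/* with each of x1, x2 and x3.                                */
proc corr data=collin;
   var x1-x4;
run;

/* Joint view: VIF and tolerance flag the near-dependency. */
proc reg data=collin;
   model y = x1-x4 / vif tol;
run;
quit;
```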
Paige,
I know what you are talking about (a linear combination of multiple variables).
But I don't think there is a problem with detecting and deleting variables one at a time (a linear combination of multiple variables must be highly correlated with at least one of those variables).
If you don't agree with that, you could try PROC GLMSELECT or PROC HPGENSELECT, which would address your concern.