Hello,
I am a PhD student facing a multicollinearity problem (VIF > 10), and I cannot drop the variables. The problem arises from the use of interaction terms. While searching for a solution, I learned about ridge regression and used the following SAS code:
proc reg data=OBJ.OBJ1 outvif
outest=vif ridge=0 to 0.05 by .002;
model WCRATIO= wcs BGDUM WIO INTPROMBG INTPROMIO intbgio Wsize Wpbratio Wcfota WCFVOLSD Wcapex WnwctaCASHMS Wlev dd wrdsales;
run;
proc print data=vif;
run;
It produces output, but without standard errors, t-values, or p-values. To report the results table in my thesis, I need the t-values. How can I get this information?
I am attaching a screenshot of the output.
Please help. I would be very grateful for a way to get the t-values; otherwise I cannot use this output, because there is no way to tell whether the coefficients are significant.
Thank you.
Rajneesh Jha
I'd like to look at your output, but many people (including me) will not download MS Office documents because they are a security risk. If you have a screen capture, please put it in a PDF or GIF.
When I run example code from the SAS Documentation, I see the t-values without problem.
I think Paige is seeing the standard errors and p-values for the unridged case. The OP is correct that the standard errors and p-values are not produced for ridge regression. If they are important, you would need to generate bootstrap samples and use the bootstrap distribution to determine if the estimates are sign... Bootstrap estimates for regression coefficients can be tricky, so I suggest you discuss the issue with your advisor.
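For readers who want to see what the bootstrap suggestion looks like in practice, here is a minimal sketch of case resampling. It is an illustration, not a definitive recipe: the data set (sashelp.cars), the model, the fixed ridge parameter of 0.1, the 500 replicates, the seed, and the names BootSamp, BootEst, and BootCI are all assumptions chosen for the example.

```sas
/* 1. Draw 500 bootstrap resamples of the rows, with replacement */
proc surveyselect data=sashelp.cars out=BootSamp seed=12345 noprint
     method=urs samprate=1 outhits reps=500;
run;

/* 2. Fit the ridge regression (one fixed ridge parameter) in each resample */
proc reg data=BootSamp outest=BootEst ridge=0.1 noprint;
   by Replicate;
   model mpg_city = Weight EngineSize Horsepower Wheelbase;
quit;

/* 3. Percentile intervals from the bootstrap distribution of the ridge estimates */
proc univariate data=BootEst(where=(_TYPE_="RIDGE")) noprint;
   var Weight EngineSize Horsepower Wheelbase;
   output out=BootCI pctlpts=2.5 97.5
          pctlpre=Weight_ EngineSize_ Horsepower_ Wheelbase_;
run;

proc print data=BootCI;
run;
```

A coefficient whose 2.5th-to-97.5th percentile interval excludes zero would be a candidate for "significant" under this approach. Whether case resampling suits your data is one of the tricky points worth raising with your advisor.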
Does this mean that ridge regression will not give me the SE, t-value, and p-value?
The bootstrap link seems very complicated to me.
Please suggest something simpler, if possible.
Thank you.
Dear Sir,
I found the standard errors for ridge regression, and I can use them to calculate the t-values.
Adding the OUTSEB option after OUTVIF in the PROC REG statement writes the standard errors to the output data set.
Now I want to know how to report model fit statistics such as the F value and adjusted R-square.
Can I combine the ANOVA table from the unridged regression with the beta estimates and t-values from the ridge regression?
Please advise.
Thank you
Excellent! I thought I had tested that option, but clearly I was wrong. For completeness, here is how you get standard errors in the OUTEST= data set:
proc reg data=sashelp.cars plots=none
outest=PE outseb ridge=(0 to 0.5 by 0.1);
model mpg_city = Weight enginesize horsepower wheelbase / VIF;
quit;
proc print data=PE(where=(_TYPE_ contains "RIDGE"));
var _TYPE_ _RIDGE_ Intercept--Wheelbase;
run;
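As a follow-up sketch, the ridge estimates and their standard errors in PE can be paired up to form the ratios the original poster asked about. This assumes the OUTEST= data set labels the two kinds of rows with _TYPE_ values "RIDGE" and "RIDGESEB"; the names PE2, Long, TVals, and t_approx are made up for the example.

```sas
/* Keep only the ridge estimates and their standard errors */
proc sort data=PE(where=(_TYPE_ in ("RIDGE", "RIDGESEB"))) out=PE2;
   by _RIDGE_ _TYPE_;
run;

/* One row per parameter per ridge value, with columns RIDGE and RIDGESEB */
proc transpose data=PE2 out=Long(rename=(_NAME_=Parameter));
   by _RIDGE_;
   id _TYPE_;
   var Intercept Weight EngineSize Horsepower Wheelbase;
run;

/* Ratio of estimate to standard error; left missing when no SE is available */
data TVals;
   set Long;
   if RIDGESEB > 0 then t_approx = RIDGE / RIDGESEB;
run;

proc print data=TVals;
run;
```

Treat these ratios with caution: ridge estimates are biased by construction, so the ratio does not follow the usual t distribution exactly, and any p-value derived from it is approximate at best.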
I checked the example code, but I did not find the t-values. The output from the example code is similar to what I am getting.
I am also attaching the PDF for reference.
Thank You.
If multicollinearity reflects high correlation between an interaction term (e.g., X1*X2) and its lower-level components (e.g., X1 and X2), then centering the continuous predictor variables (X1 and X2) and then computing the interaction variable will help. See http://ww2.amstat.org/publications/jse/v19n3/afshartous.pdf
Centering will not address collinearity between X1 and X2; that correlation would be a fundamental characteristic of the data.
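A minimal sketch of the centering idea, using sashelp.cars with Weight and Horsepower standing in for X1 and X2 (these variables, and the names Centered, c_Weight, c_Horsepower, and c_WxH, are assumptions for illustration only):

```sas
/* Center each continuous predictor at its mean, then build the interaction
   from the centered versions. PROC SQL remerges the means onto every row. */
proc sql;
   create table Centered as
   select *,
          Weight - mean(Weight)         as c_Weight,
          Horsepower - mean(Horsepower) as c_Horsepower,
          calculated c_Weight * calculated c_Horsepower as c_WxH
   from sashelp.cars;
quit;

/* Check the VIFs for the centered main effects and their interaction */
proc reg data=Centered plots=none;
   model mpg_city = c_Weight c_Horsepower c_WxH / vif;
quit;
```

Comparing these VIFs with those from the uncentered model shows how much of the collinearity came from the interaction term and how much is inherent in the correlation between the two predictors.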
I also tried this method (centering), but the VIF was still high. This may be due to the interaction with a dummy variable.