Rajneesh_Jha
Fluorite | Level 6

Hello,

 

I am a PhD student facing a multicollinearity problem (VIF > 10), and I can't drop the variables. The problem arises from the interaction terms in the model. While searching for a solution, I learned about ridge regression and used the following SAS code:

 

proc reg data=OBJ.OBJ1 outvif
         outest=vif ridge=0 to 0.05 by .002;
   model WCRATIO = wcs BGDUM WIO INTPROMBG INTPROMIO intbgio Wsize Wpbratio
                   Wcfota WCFVOLSD Wcapex WnwctaCASHMS Wlev dd wrdsales;
run;

proc print data=vif;
run;

 

It produces output, but without standard errors, t-values, or p-values. To report the results table in my thesis, I need the t-values. How can I get them?
I am attaching a screenshot of the output.

Please help. I would be really grateful for a way to get the t-values; otherwise I will not be able to use this output, because there is no way to tell whether the coefficients are significant.

 

Thank you.

 

Rajneesh Jha

 

 

PaigeMiller
Diamond | Level 26

I'd like to look at your output, but many people (including me) will not download MS Office documents because it is a security risk. If you have a screen capture, please put it in a PDF or GIF.

 

When I run example code from the SAS Documentation, I see the t-values without problem. 

--
Paige Miller
Rick_SAS
SAS Super FREQ

I think Paige is seeing the standard errors and p-values for the unridged case. The OP is correct that the standard errors and p-values are not produced for ridge regression. If they are important, you would need to generate bootstrap samples and use the bootstrap distribution to determine if the estimates are sign... Bootstrap estimates for regression coefficients can be tricky, so I suggest you discuss the issue with your advisor.
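
As a rough illustration of the bootstrap idea mentioned above, the following sketch resamples cases with replacement, refits the ridge model on each resample, and summarizes the coefficients with percentile intervals. It is not from the original posts: the Sashelp.Cars data, the fixed ridge parameter of 0.1, the 500 replicates, the seed, and the data set names Boot, BootPE, and BootCI are all assumptions chosen for illustration, and case resampling is only one of several possible bootstrap schemes.

```sas
/* Sketch only: case-resampling bootstrap for ridge coefficients.     */
/* Assumed for illustration: Sashelp.Cars, ridge=0.1, 500 replicates. */
proc surveyselect data=sashelp.cars out=Boot seed=12345
                  method=urs samprate=1 reps=500 outhits noprint;
run;

/* Refit the ridge regression within each bootstrap sample */
proc reg data=Boot plots=none noprint outest=BootPE ridge=0.1;
   by Replicate;
   model mpg_city = Weight EngineSize Horsepower Wheelbase;
quit;

/* Keep only the ridge rows and compute 95% percentile intervals */
proc univariate data=BootPE(where=(_TYPE_="RIDGE")) noprint;
   var Weight EngineSize Horsepower Wheelbase;
   output out=BootCI pctlpts=2.5 97.5
          pctlpre=Weight_ EngineSize_ Horsepower_ Wheelbase_;
run;

proc print data=BootCI;
run;
```

An interval that excludes zero suggests that the coefficient is distinguishable from zero at that ridge value, but the interpretation depends on the resampling scheme and on how the ridge parameter was chosen.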

Rajneesh_Jha
Fluorite | Level 6

Does that mean that with ridge regression I will not get the SE, t-value, and p-value?

The bootstrap link seems very complicated to me.

Please suggest something simpler, if possible.

 

Thank you.

Rajneesh_Jha
Fluorite | Level 6

Dear Sir,

 

I found the standard errors for ridge regression, and using them I can calculate the t-values.

Adding OUTSEB after OUTVIF in the PROC REG statement writes the standard errors to the output data set.

But now I want to know how I can report model fit statistics such as the F value, adjusted R-square, etc.

Can I use the ANOVA table from the unridged regression together with the beta estimates and t-values from the ridge regression?

Please advise.

 

Thank you

Rick_SAS
SAS Super FREQ

Excellent! I thought I had tested that option, but clearly I was wrong. For completeness, here is how you get standard errors in the OUTEST= data set:

 

proc reg data=sashelp.cars plots=none 
         outest=PE outseb ridge=(0 to 0.5 by 0.1);
   model mpg_city = Weight enginesize horsepower wheelbase / VIF;
quit;

proc print data=PE(where=(_TYPE_ contains "RIDGE")); 
   var _TYPE_ _RIDGE_ Intercept--Wheelbase;
run;
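
If a t-like ratio is needed, one possible sketch is to pair each RIDGE row with its RIDGESEB row in the PE data set created above and divide the estimate by its standard error. The data set names PE_sorted, PE_long, and RidgeT and the variable name t_ratio are illustrative assumptions. Because ridge estimates are biased, this ratio may not follow the usual t distribution, so it is best treated as descriptive.

```sas
/* Sketch only: assumes PE was created by the PROC REG step above */
/* (OUTEST=PE with the OUTSEB and RIDGE= options).                */
proc sort data=PE out=PE_sorted;
   where _TYPE_ in ("RIDGE", "RIDGESEB");
   by _RIDGE_ _TYPE_;
run;

/* One row per ridge value and parameter; columns RIDGE and RIDGESEB */
proc transpose data=PE_sorted out=PE_long(rename=(_NAME_=Parameter));
   by _RIDGE_;
   id _TYPE_;
   var Intercept Weight EngineSize Horsepower Wheelbase;
run;

/* Ratio of ridge estimate to its standard error */
data RidgeT;
   set PE_long;
   if RIDGESEB > 0 then t_ratio = RIDGE / RIDGESEB;
run;

proc print data=RidgeT noobs;
   var _RIDGE_ Parameter RIDGE RIDGESEB t_ratio;
run;
```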
Rajneesh_Jha
Fluorite | Level 6

I checked the example code, but I did not find the t-values. The output from the example code is similar to what I am getting.

I am also attaching a PDF for reference.

 

Thank You.

sld
Rhodochrosite | Level 12

If multicollinearity reflects high correlation between an interaction term (e.g., X1*X2) and its lower-level components (e.g., X1 and X2), then centering the continuous predictor variables (X1 and X2) and then computing the interaction variable will help. See http://ww2.amstat.org/publications/jse/v19n3/afshartous.pdf

 

Centering will not address collinearity between X1 and X2; that correlation would be a fundamental characteristic of the data.
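
A minimal sketch of the centering approach, assuming two continuous predictors: center them first, build the product from the centered versions, and compare the VIFs with those from the uncentered model. The Sashelp.Cars data, the choice of Weight and Horsepower, and the names Raw, Centered, and WxH are assumptions for illustration, not the original poster's variables.

```sas
/* Sketch only: illustrative data and variable names */
data Raw;
   set sashelp.cars(keep=mpg_city Weight Horsepower);
   WxH = Weight * Horsepower;      /* interaction from raw predictors */
run;

/* METHOD=MEAN subtracts each variable's mean without rescaling */
proc stdize data=sashelp.cars(keep=mpg_city Weight Horsepower)
            out=Centered method=mean;
   var Weight Horsepower;
run;

data Centered;
   set Centered;
   WxH = Weight * Horsepower;      /* interaction from centered predictors */
run;

/* VIFs with the uncentered interaction */
proc reg data=Raw plots=none;
   model mpg_city = Weight Horsepower WxH / vif;
quit;

/* VIFs with the centered interaction */
proc reg data=Centered plots=none;
   model mpg_city = Weight Horsepower WxH / vif;
quit;
```

Comparing the two VIF columns shows how much of the inflation came from the uncentered product. Any inflation that remains for Weight and Horsepower themselves reflects their own correlation, which centering does not change.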

 

Rajneesh_Jha
Fluorite | Level 6

I also tried this method (centering), but the VIF was still high. This may be due to the interaction with a dummy variable.
