I am using the Variance Inflation Factor to measure multicollinearity, and I get the message below. My problem is that I can't drop any variable; all of them are important. How can I solve the problem so that G and F are kept without being set to 0?
Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased.
Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.
G = 0.44521 * C
F = 0.2637 * C
PH=232.558 * Intercept + 447E-17 * C - 27E-16 * H + 104E-16 * E + 204E-16 * S + 744E-16 * PC - 449E-16 * PG - 15E-16 * PF - 155E-16 * PE +
Thank you in advance for answers and help
@nassimsa wrote:
Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased.
Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.
This means that SAS cannot compute a separate regression coefficient for one or more of your variables, because there are perfect collinearities among them. Estimating all of them is mathematically impossible, so SAS sets some of the regression coefficients to zero. You can't have them all.
So you have to re-think the problem, and possibly remove one or more variables. In reality, this should not be a problem: since you have perfect collinearities in your data, you don't need all of those variables in your model. They are not independent and add no additional information. If you really want, you can compute the effect of a zeroed variable as a mathematical function of the coefficients of the non-zeroed variables <-- but don't do this, eliminate the variables.
But really, you need to re-think the problem.
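To make the point above concrete, here is a small NumPy sketch (not SAS, and not the poster's data). It builds made-up data in which G and F are exact multiples of C, using the multipliers from the SAS note, and checks that the design matrix loses rank and that dropping G and F leaves the fitted values unchanged. The sample size, the response `y`, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                      # hypothetical sample size
C = rng.normal(size=n)
G = 0.44521 * C             # exact multiples of C, as in the SAS note
F = 0.2637 * C
y = 1.0 + 2.0 * C + rng.normal(scale=0.1, size=n)  # made-up response

# Design matrices: intercept, C, G, F versus intercept, C only
X_full = np.column_stack([np.ones(n), C, G, F])
X_reduced = X_full[:, :2]

# Four columns, but only two of them carry independent information
rank_full = np.linalg.matrix_rank(X_full)

# lstsq returns the minimum-norm solution when the matrix is rank deficient;
# it is one of infinitely many coefficient vectors with the same fit
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
beta_red, *_ = np.linalg.lstsq(X_reduced, y, rcond=None)

fit_full = X_full @ beta_full
fit_red = X_reduced @ beta_red

# Only this combination of the C, G and F coefficients is identified
combined_slope = beta_full[1] + 0.44521 * beta_full[2] + 0.2637 * beta_full[3]

print("rank of full design:", rank_full)
print("same fitted values:", np.allclose(fit_full, fit_red))
print("combined slope vs. reduced slope:", combined_slope, beta_red[1])
```

The individual coefficients for C, G and F are arbitrary here; only their weighted sum is determined by the data, which is why a model containing C alone fits just as well.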
Try adding restrictions to your model:
restrict C = 0.44521 * G, C = 0.2637 * F;
I would like to know whether, and how, a ridge regression can solve the problem. I tried a ridge regression but don't know how to use the results.
In any case, I need to think about the problem in a different way, as suggested in the previous message.
As far as I know, using ridge regression here just masks the problem; it doesn't address the fact that you have variables that are perfectly correlated.
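A rough NumPy sketch of what "masks the problem" could look like, under simplifying assumptions: made-up data, no intercept, no standardization of the predictors, and a hand-written textbook ridge formula rather than whatever SAS computes internally. The helper `ridge` and the ridge constant `k` are hypothetical names and values for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50                                   # hypothetical sample size
C = rng.normal(size=n)
w = np.array([1.0, 0.44521, 0.2637])     # multipliers for C, G, F
X = np.column_stack([w[0] * C, w[1] * C, w[2] * C])
y = 2.0 * C + rng.normal(scale=0.1, size=n)  # made-up response


def ridge(X, y, k):
    """Textbook ridge estimate: solve (X'X + k I) b = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)


b = ridge(X, y, k=0.01)

# Ridge does return a number for every variable, but the G and F
# coefficients are just fixed multiples of the C coefficient
print("ridge coefficients:", b)
print("ratios to the C coefficient:", b / b[0])
```

In this setup the ratios come out equal to the multipliers themselves, so the way the effect is split across C, G and F is dictated by the penalty and the scaling of the columns, not by anything the data say about G or F separately.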