nassimsa
Obsidian | Level 7

I am using the Variance Inflation Factor to measure multicollinearity and I get the message below. My problem is that I can't drop any variable; all of them are important. How can I solve the problem so that G and F are kept without being set to 0?

 

Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased.
Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.
G = 0.44521 * C
F = 0.2637 * C
PH=232.558 * Intercept + 447E-17 * C - 27E-16 * H + 104E-16 * E + 204E-16 * S + 744E-16 * PC - 449E-16 * PG - 15E-16 * PF - 155E-16 * PE +

 

Thank you in advance for answers and help

6 REPLIES
PaigeMiller
Diamond | Level 26

@nassimsa wrote:

Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased.
Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.


This means that a regression coefficient cannot be computed for one or more of your variables, because there are perfect collinearities between the variables. In other words, estimating all of them is mathematically impossible, so SAS sets some of the regression coefficients to zero. You can't have them all.
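To see why, take the first dependency reported in the log, G = 0.44521 * C. Writing \beta_C and \beta_G for the coefficients of C and G (notation assumed here for illustration; it does not appear in the log), their joint contribution to the model collapses to a single term:

```latex
\beta_C\, C + \beta_G\, G
  = \beta_C\, C + \beta_G\,(0.44521\, C)
  = (\beta_C + 0.44521\,\beta_G)\, C
```

The data determine only the combined quantity \beta_C + 0.44521\,\beta_G. Any pair of values that gives the same sum fits equally well, which is what "not unique" means in the message. Setting \beta_G = 0 is simply one of those equally good choices, and the same argument applies to F = 0.2637 * C.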

 

So you have to re-think the problem, and possibly remove one or more variables. In reality, this should not be a problem: since you have perfect collinearities in your data, you don't need all of those variables in your model. They are not independent and add no additional information. If you really want, you can compute the effect of a zeroed variable as a mathematical function of the coefficients of the non-zeroed variables, but don't do this; eliminate the variables.
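As a minimal sketch of what eliminating the variables could look like in PROC REG: the data set name MYDATA and the response Y are hypothetical, since neither appears in the thread, and the regressor list contains only the names visible in the truncated note, so the real model may have more terms.

```sas
/* Hypothetical names: MYDATA (input data set) and Y (response)  */
/* are placeholders and do not come from the thread.             */
proc reg data=mydata;
   /* G and F are left out because the log reports them as exact */
   /* multiples of C. PH is also left out on the assumption,     */
   /* based on the truncated note, that it is nearly constant.   */
   model Y = C H E S PC PG PF PE / vif tol;
run;
quit;
```

With the exact dependencies removed, the VIF and TOL columns describe the remaining, non-perfect collinearity. Because G and F are fixed multiples of C, the fitted coefficient of C then stands for the combined effect of all three.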

 

But really, you need to re-think the problem.

 

--
Paige Miller
nassimsa
Obsidian | Level 7
Thank you for your reply. I think I will re-think the problem as you suggest and divide the analysis into three steps, one for each variable.

PGStats
Opal | Level 21

Try adding restrictions to your model

 

restrict C = 0.44521 * G, C = 0.2637 * F;

PG
nassimsa
Obsidian | Level 7
I tried the restriction, but I still have the same problem.
nassimsa
Obsidian | Level 7

I would like to know whether, and how, ridge regression can solve the problem. I tried a ridge regression but don't know how to use the results.

 

In any case, I need to think about the problem in a different way, as suggested in the previous message.

PaigeMiller
Diamond | Level 26

As far as I know, using ridge regression here just masks the problem; it doesn't address the fact that you have variables that are perfectly correlated.

--
Paige Miller
