Alain38
Quartz | Level 8

Hello,
When performing many least squares regressions in IML, in some cases it says that the matrix XX' is singular, yet when I print it and copy-paste it below, I can invert it (with a determinant very close to 0, det = 3.058E-39).

I guess that when I print and copy-paste the matrix, the values are rounded, so the determinant is no longer exactly 0.

How can I handle this so that I can always perform the regression? Would rounding, or adding a very small term such as 1e-6, be an acceptable fix?

Any suggestion welcome,

 

proc iml;
A = {0.451560519 0.393854273 0.346603836 0.307645432 0.275261503 0.24814273 0.225250527,
0.393854273 0.346607438 0.30763793 0.275263883 0.248142802 0.225250838 0.205772708,
0.346603836 0.30763793 0.275254204 0.24814199 0.225248495 0.205770586 0.189066156,
0.307645432 0.275263883 0.24814199 0.225255504 0.205775816 0.189070797 0.174632917,
0.275261503 0.248142802 0.225248495 0.205775816 0.189070104 0.174632041 0.162058981,
0.24814273 0.225250838 0.205770586 0.189070797 0.174632041 0.162058387 0.151029103,
0.225250527 0.205772708 0.189066156 0.174632917 0.162058981 0.151029103 0.141286734};

A_inv = inv(A); print A_inv;
det_A = det(A); print det_A;
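For reference, one can also look at the singular values of A to see how close it is to singular, independent of whether INV succeeds; a small check continuing the session above (the 1e-12 tolerance is an arbitrary choice):

call svd(U, Q, V, A);              /* Q holds the singular values, largest first */
cond_A  = Q[1] / Q[nrow(Q)];       /* huge value => A is nearly singular          */
numRank = sum(Q > 1e-12 * Q[1]);   /* numerical rank with a relative tolerance    */
print cond_A numRank;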
1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

The problem is your data: for some data you have a singular X'X (not XX') matrix, which cannot be inverted. You need to decide what you want to do in this case; how do you want to handle the situation? There are a number of options. Off the top of my head, here are some possible solutions:

 

  • Remove a variable (or sometimes more than one variable) so that the X'X matrix is no longer singular.
  • Use a generalized inverse of X'X, which will give you regression coefficients for these data, but they will not be unique.
  • Don't compute the regression for these data; instead, print an error message of some sort.

You might also want a better understanding of why some data produce a singular X'X matrix; this may be helpful to you in whatever it is you are doing. (Is it because you have dummy variables to represent categories?)

 

Also, why don't you just use PROC REG in this case to produce many different regressions?
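As a rough sketch of how the second and third options above might be combined, one could test the numerical rank of X`X before deciding how to proceed. (The simulated data and the 1e-12 tolerance below are arbitrary placeholders, not anything from the original question.)

proc iml;
/* sketch: test X`X before inverting; simulated data just to make it runnable */
call randseed(123);
X = randfun({20 3}, "Normal");
X = X || (X[,1] + X[,2]);            /* force a linear dependency               */
Y = randfun({20 1}, "Normal");

A = X`*X;
call svd(U, Q, V, A);                /* singular values of X`X                  */
tol = 1e-12 * Q[1];                  /* arbitrary relative tolerance            */
if sum(Q > tol) < ncol(A) then do;   /* numerically rank-deficient              */
   print "X`X is singular: using GINV (estimates are not unique)";
   b = ginv(A) * X`*Y;
end;
else b = inv(A) * X`*Y;              /* ordinary least squares solution         */
print b;
quit;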

--
Paige Miller


17 REPLIES
Rick_SAS
SAS Super FREQ

I agree with Paige's comments. To expand on his suggestions, you can use the GINV function in SAS/IML to obtain the generalized inverse. This will enable you to find a solution, but as Paige points out, the solution is not unique. Still, if this is part of a simulation study, you probably want the generalized solution rather than aborting the study because of one singular set of data:

 

proc iml;
/* generate three explanatory variables */
z = randfun({100 3}, "Normal");

/* create X so that X[,4] is linearly dependent on X[,1:3] */
LinDep = z[,1] - z[,2] + 2*z[,3];
x = j(nrow(z), 1, 1) || z || LinDep;  /* design matrix with intercept column */

beta = {-3, 7, 2, -4, -2}; /* parameters */
Y = X*beta;                /* response variable */

/* we cannot solve the normal equations because X`X is singular */
A = X`*X;
det_A = det(A); print det_A; /* verify that X`X is singular */
b = solve( X`*X, X`*Y );  /* ERROR: SOLVE function fails */
A_inv = inv(A);           /* ERROR: INV function fails */

/* however, we can use a generalized inverse to obtain a (nonunique) solution */
A_ginv = ginv(A); 
estimate = A_ginv*X`*Y;
print beta estimate;
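As a quick follow-up check (not shown in the code above), the generalized-inverse solution still reproduces the response exactly, even though ESTIMATE differs from BETA, because Y lies in the column space of X:

/* continue the session above */
maxResid = max(abs(Y - X*estimate));   /* essentially zero for this noise-free example */
print maxResid;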


 

PaigeMiller
Diamond | Level 26

@Rick_SAS: if the GINV function is used, does it produce "consistent" results regardless of the cause of the singularity, so that the results can be compared across all data sets, with either non-singular or singular X'X matrices? Or does GINV change its algorithm based upon the type of singularity?

--
Paige Miller
Alain38
Quartz | Level 8
Thank you very much for your suggestions!

In my case b = YX' * inv(XX'), with Y a (1x150) vector and X a (7x150) matrix.

In a paper working with the same sample, the authors introduce a term equal to 1e-6, saying that it guarantees the stability of the estimator by ensuring that the matrix is always invertible. Is this a viable option? If not, I will use the generalized inverse.
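For concreteness, here is a minimal sketch of the two alternatives in that orientation (Y is 1 x 150 and X is 7 x 150). The random matrices below are only placeholders for the real data, and 1e-6 stands in for the paper's trimming value:

proc iml;
/* placeholder data with the same shapes as the real problem */
call randseed(1);
X = randfun({7 150}, "Normal");
Y = randfun({1 150}, "Normal");

/* option 1: the "1e-6 trick" -- a ridge-style term keeps X*X` invertible */
b_ridge = Y*X` * inv(X*X` + 1e-6*I(nrow(X)));

/* option 2: the Moore-Penrose generalized inverse, no tuning constant    */
b_ginv  = Y*X` * ginv(X*X`);

print (b_ridge`)[label="b_ridge"] (b_ginv`)[label="b_ginv"];
quit;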
PaigeMiller
Diamond | Level 26

@Alain38 wrote:
Thank you very much for your suggestions!

In my case b = YX' * inv(XX'), with Y a (1x150) vector and X a (7x150) matrix.

In a paper working with the same sample, the authors introduce a term equal to 1e-6, saying that it guarantees the stability of the estimator by ensuring that the matrix is always invertible. Is this a viable option? If not, I will use the generalized inverse.

Well, I guess that depends on the definition of "viable" and on knowing what you are doing (which we really don't). I could see that, for some data, 1e-6 might have a negligible effect on the results, while for other data it might have a large effect.

 

So ... we don't have enough information to help you decide, but I think you have enough information to evaluate this 1e-6 method.

 

As far as I can recall, the generalized inverse method does NOT have this particular drawback, but as I said, the estimates it returns are not unique.
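One way to evaluate the 1e-6 method on a given data set is to recompute the coefficients over a grid of ridge values and see how much they move. A minimal sketch, with placeholder data containing two nearly collinear columns (everything here is illustrative, not from the original problem):

proc iml;
/* sketch: how sensitive are the coefficients to the ridge term rho? */
call randseed(2);
x1 = randfun({50 1}, "Normal");
X  = j(50,1,1) || x1 || (x1 + 1e-5*randfun({50 1}, "Normal"));  /* nearly collinear */
Y  = randfun({50 1}, "Normal");

A = X`*X;
rhos = {1e-8 1e-6 1e-4 1e-2};
do i = 1 to ncol(rhos);
   b = inv(A + rhos[i]*I(ncol(A))) * X`*Y;
   print (rhos[i])[label="rho"] (b`)[label="coefficients"];
end;
quit;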

--
Paige Miller
Rick_SAS
SAS Super FREQ

I'm sorry, but I'm not familiar with the phrase "type of singularity." Do you mean rank?

 

Perhaps by "consistency" you are referring to the relationship between INV and GINV when the matrix is nonsingular. Yes, INV and GINV should return the same matrix when the input is nonsingular.

PaigeMiller
Diamond | Level 26

@Rick_SAS wrote:

I'm sorry, but I'm not familiar with the phrase "type of singularity." Do you mean rank?

 

Perhaps by "consistency" you are referring to the relationship between INV and GINV when the matrix is nonsingular. Yes, INV and GINV should return the same matrix when the input is nonsingular.


Yes, I'm not sure exactly what I mean either.

 

If you have a binary variable and use two columns of zeros and ones, let's call them X1 and X2, then you have the singularity X1 + X2 = 1. But there are other causes of singularities, for example 3*X1 + 2*X2 - 7*X3 = 4*X4 - 0.5*X5 + 1.4*X6. I'm just wondering whether the algorithm in GINV does the same thing regardless of how complex or simple the singularity is. Or, to use PROC GLM terminology, does the parameterization change depending on something else? The default parameterization of a binary variable sets the coefficient of the last alphabetical level to zero, but other parameterizations can be requested, giving different parameter estimates.

--
Paige Miller
Rick_SAS
SAS Super FREQ

Hi @PaigeMiller,

Both of your examples are rank-1 singularities because you can write one variable as a linear combination of the others. Mathematically, there is no difference between those two examples. If you have a SECOND linear dependency, then you have a rank-2 singularity. All rank-2 singularities are equivalent, but they are different from the rank-1 singularities, as shown by the Fundamental Theorem of Linear Algebra. 

 

If the model is the same, you should get the same estimates regardless of the reference levels. For example, if you use two different GLM parameterizations (with different reference levels), then you will get the same parameter estimates.
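A small sketch (with simulated placeholder data) illustrating the rank point above: both the dummy-variable dependence and the more complicated linear combination reduce the rank of the design matrix by exactly one:

proc iml;
/* sketch: both kinds of linear dependence create a rank-1 deficiency      */
call randseed(3);
Z   = randfun({100 6}, "Normal");
one = j(100, 1, 1);                       /* intercept column               */

/* case 1: dummy-style dependence, X1 + X2 = intercept                     */
x1  = (Z[,1] > 0);  x2 = 1 - x1;
X_a = one || x1 || x2 || Z[,3:6];

/* case 2: 3*X1 + 2*X2 - 7*X3 = 4*X4 - 0.5*X5 + 1.4*X6                     */
X6  = (3*Z[,1] + 2*Z[,2] - 7*Z[,3] - 4*Z[,4] + 0.5*Z[,5]) / 1.4;
X_b = one || Z[,1:5] || X6;

start numRank(X);
   call svd(U, Q, V, X);
   return( sum(Q > 1e-12*Q[1]) );         /* numerical rank, arbitrary tolerance */
finish;

print (ncol(X_a))[label="columns, case 1"] (numRank(X_a))[label="rank, case 1"],
      (ncol(X_b))[label="columns, case 2"] (numRank(X_b))[label="rank, case 2"];
quit;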

PaigeMiller
Diamond | Level 26

@Rick_SAS wrote:

Hi @PaigeMiller,

Both of your examples are rank-1 singularities because you can write one variable as a linear combination of the others. Mathematically, there is no difference between those two examples. If you have a SECOND linear dependency, then you have a rank-2 singularity. All rank-2 singularities are equivalent, but they are different from the rank-1 singularities, as shown by the Fundamental Theorem of Linear Algebra. 

 


Probably I wasn't thinking or explaining clearly, so let's drop this part.

 

If the model is the same, you should get the same estimates regardless of the reference levels.

 

Estimates of what? I agree that the predicted values will be the same regardless of the reference levels, but the model coefficients change when the reference level changes.

 

For example, if you use two different GLM parameterizations (with different reference levels), then you will get the same parameter estimates.

 

I don't agree; the parameter estimates change when you change the reference level. You can do the algebra and wind up with the exact same model in each case, and get the exact same predicted values, but the output from the PROC will show different coefficients.

--
Paige Miller
Rick_SAS
SAS Super FREQ

Hi @PaigeMiller,

I think we are hijacking this thread from the OP. I'm happy to discuss this further, but maybe you should start another thread.

 

Yes, PROC GLM will obtain different parameter estimates. It uses what is known as a G2 pseudoinverse (which is not unique) for the parameter estimates. However, the GINV function in SAS/IML computes the Moore-Penrose inverse, which is unique. Even if you use two GLM parameterizations (with different reference levels), you will obtain the same generalized inverse and therefore the same parameter estimates.
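For reference, the Moore-Penrose inverse is characterized by four conditions that are easy to verify in IML. A minimal sketch, using an arbitrary singular matrix chosen only for illustration:

proc iml;
/* sketch: verify that GINV returns the (unique) Moore-Penrose inverse     */
A = {1 2 3,
     2 4 6,                     /* row 2 = 2 * row 1, so A is singular      */
     1 1 1};
G = ginv(A);

/* the four Penrose conditions that characterize the Moore-Penrose inverse */
c1 = max(abs(A*G*A - A));       /* A*G*A = A        */
c2 = max(abs(G*A*G - G));       /* G*A*G = G        */
c3 = max(abs((A*G)` - A*G));    /* A*G is symmetric */
c4 = max(abs((G*A)` - G*A));    /* G*A is symmetric */
print c1 c2 c3 c4;              /* all essentially zero                     */
quit;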

Rick_SAS
SAS Super FREQ

I'm going to guess that the 1E-6 term is a ridging parameter and that they are actually inverting the matrix

X`*X + 1E-6*I

 

If you provide the reference or the exact quote, we won't have to guess.

Alain38
Quartz | Level 8

The paper is Adrian, Crump and Moench (2015), "Regression-based estimation of dynamic asset pricing models", p. 218, eq. (36), with more details in Appendix A.2, p. 229.

I am reproducing their method, but this problem appeared when computing the optimal bandwidth (the kernel regressions here are similar to weighted least squares). The dataset consists of stock returns.

I guess that using this "1e-6 trick" in the above case would indeed deliver: X`*X + 1E-6*I

 

They call it the trimming parameter, denoted rho_T.

Rick_SAS
SAS Super FREQ

If you are doing kernel regression, you might be interested in this article about kernel regression in SAS. It mentions that when the bandwidth parameter is too small, "The linear system that you need to solve will be singular." That is one reason why many researchers switched from kernel regression to loess.
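A toy sketch (not taken from that article) of why a too-small bandwidth breaks a local-linear kernel fit: with a Gaussian kernel, almost all weights underflow to zero, and the 2x2 normal-equations matrix becomes singular:

proc iml;
/* sketch: the 2x2 system of a local-linear kernel fit at one point        */
x  = t( do(0, 10, 0.1) );             /* evenly spaced x values             */
x0 = 5;                               /* point at which to evaluate the fit */
Z  = j(nrow(x), 1, 1) || (x - x0);    /* local-linear design matrix         */

bandwidths = {0.001 1};               /* far too small vs. a reasonable value */
do i = 1 to ncol(bandwidths);
   h = bandwidths[i];
   w = exp( -0.5 * ((x - x0)/h)##2 ); /* Gaussian kernel weights            */
   M = Z` * diag(w) * Z;              /* 2x2 normal-equations matrix        */
   print h (det(M))[label="det(Z`WZ)"];
end;
quit;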

PaigeMiller
Diamond | Level 26

@Alain38 wrote:

The paper is Adrian, Crump and Moench (2015), "Regression-based estimation of dynamic asset pricing models", p. 218, eq. (36), with more details in Appendix A.2, p. 229.

I am reproducing their method, but this problem appeared when computing the optimal bandwidth (the kernel regressions here are similar to weighted least squares). The dataset consists of stock returns.

I guess that using this "1e-6 trick" in the above case would indeed deliver: X`*X + 1E-6*I

 

They call it the trimming parameter, denoted rho_T.


It is my understanding that this ridging parameter (the 1e-6 trick) is useful when the variables are highly (but not perfectly) correlated, rather than when the matrix is exactly singular. However, it could certainly be used when the matrix is exactly singular. Nevertheless, this all gets back to the fact that the results are not unique, which still makes me uncomfortable.

--
Paige Miller
