
🔒 This topic is **solved** and **locked**.


Posted 11-13-2018 09:20 AM
(3146 views)

Hello,

When performing many least squares regressions in IML, in some cases it says that the matrix XX' is singular. Yet when I print the matrix and copy-paste it below, I can invert it (with a determinant really close to 0: det = 3.058E-39).

I guess that when I print and copy-paste the matrix, the values are rounded, so the determinant is no longer strictly equal to 0.

How can I solve this so that I can always perform the regression? Maybe rounding, or adding a very small term like 1e-6, could be a fix?

Any suggestion is welcome.

```
proc iml;
A = {0.451560519 0.393854273 0.346603836 0.307645432 0.275261503 0.24814273 0.225250527,
0.393854273 0.346607438 0.30763793 0.275263883 0.248142802 0.225250838 0.205772708,
0.346603836 0.30763793 0.275254204 0.24814199 0.225248495 0.205770586 0.189066156,
0.307645432 0.275263883 0.24814199 0.225255504 0.205775816 0.189070797 0.174632917,
0.275261503 0.248142802 0.225248495 0.205775816 0.189070104 0.174632041 0.162058981,
0.24814273 0.225250838 0.205770586 0.189070797 0.174632041 0.162058387 0.151029103,
0.225250527 0.205772708 0.189066156 0.174632917 0.162058981 0.151029103 0.141286734};
A_inv = inv(A); print A_inv;
det_A = det(A); print det_A;
```

1 ACCEPTED SOLUTION


The problem is your data: for some data sets you have a singular X'X (not XX') matrix, which cannot be inverted. You need to decide how you want to handle that situation. There are a number of options; off the top of my head, here are some possible solutions:

- Remove a variable (or sometimes more than one variable) so that the X'X matrix is no longer singular.
- Use a generalized inverse of X'X, which will give you regression coefficients for this data, but they will not be unique.
- Don't compute the regression for this data, and instead print an error message of some sort.

You might also want a better understanding of why some data produces a singular X'X matrix; this may be helpful in whatever it is you are doing. (Is it because you have dummy variables to represent categories?)

Also, why don't you just use PROC REG in this case to produce many different regressions?
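A minimal IML sketch of the second and third options above (this is my illustration, not part of the original answer; the made-up data and the rank test via GINV/TRACE are assumptions):

```
proc iml;
/* Sketch (made-up data): branch on the numerical rank of X`X.
   Columns 2 and 3 of X sum to column 1, so X`X is singular. */
X = {1 1 0,
     1 0 1,
     1 1 0,
     1 0 1};
Y = {1, 2, 3, 4};
A = X`*X;
if round(trace(ginv(A)*A)) = ncol(A) then
   b = inv(A)*X`*Y;        /* full rank: the usual LS solution */
else do;
   print "X`X is singular; using a generalized inverse (solution is not unique)";
   b = ginv(A)*X`*Y;       /* minimum-norm solution */
end;
print b;
```

Replacing the ELSE branch with a STOP statement or an error message implements the third option instead.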

--

Paige Miller

17 REPLIES



I agree with Paige's comments. To expand on his suggestions, you can use the GINV function in SAS/IML to obtain the generalized inverse. This will enable you to find a solution, but as Paige points out, the solution is not unique. Still, if this is part of a simulation study, you probably want the generalized solution rather than aborting the study because of one singular set of data:

```
proc iml;
/* generate three explanatory variables */
z = randfun({100 3}, "Normal");
/* create X so that X[,5] is linearly dependent on X[,2:4] */
LinDep = z[,1] - z[,2] + 2*z[,3];
x = j(nrow(z), 1, 1) || z || LinDep; /* design matrix with intercept column */
beta = {-3, 7, 2, -4, -2}; /* parameters */
Y = X*beta; /* response variable */
/* we cannot solve the normal equations because X`X is singular */
A = X`*X;
det_A = det(A); print det_A; /* verify that X`X is singular */
b = solve( X`*X, X`*Y ); /* ERROR: SOLVE function fails */
A_inv = inv(A); /* ERROR: INV function fails */
/* however, we can use a generalized inverse to obtain a (nonunique) solution */
A_ginv = ginv(A);
estimate = A_ginv*X`*Y;
print beta estimate;
```


@Rick_SAS: if the GINV function is used, does it produce "consistent" results regardless of the cause of the singularity, so that the results can be compared across all data sets, with either non-singular or singular X'X matrices? Or does GINV change its algorithm based upon the type of singularity?

--

Paige Miller


Thank you very much for your suggestions!

In my case b = YX' * inv(XX'), with Y a (1x150) vector and X a (7x150) matrix.

In a paper working with the same sample, the authors introduce a term equal to 1e-6, saying that it guarantees the stability of the estimator by ensuring that the matrix is always invertible. Is that a viable option? If not, I will use the generalized inverse.

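For reference, the "1e-6 trick" in the b = YX` * inv(XX`) setup would look something like the sketch below (the simulated values are my assumption, not the poster's actual data; only the dimensions match the description above):

```
proc iml;
/* Sketch: ridge the 7x7 crossproducts matrix so that it is
   always invertible. X is 7x150 and Y is 1x150, as described
   above; the values are simulated for illustration. */
call randseed(1);
X = randfun({7 150}, "Normal");
X[7,] = X[1,] - X[2,];          /* force an exact linear dependency */
Y = randfun({1 150}, "Normal");
rho = 1e-6;                     /* the paper's trimming parameter rho_T */
b = Y*X` * inv(X*X` + rho*I(7));
print b;
```

Without the rho*I(7) term, INV would fail (or be numerically unstable) because X*X` is singular.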


@Alain38 wrote:

Thank you very much for your suggestions!

In my case b = YX' * inv(XX'), with Y a (1x150) vector and X a (7x150) matrix.

In a paper working with the same sample, the authors introduce a term equal to 1e-6, saying that it guarantees the stability of the estimator by ensuring that the matrix is always invertible. Is that a viable option? If not, I will use the generalized inverse.

Well, I guess that depends on the definition of "viable" and on what you are doing (which we really don't know). I could see that, depending on the data, 1e-6 might have a negligible effect on the results, while for other data it might have a large effect.

So ... we don't have enough information to help you decide, but I think you have enough information to evaluate this 1e-6 method.

But as far as I can recall, the generalized inverse method does NOT have this particular drawback, though as I said, the estimates returned are not unique.

--

Paige Miller


I'm sorry, but I'm not familiar with the phrase "type of singularity." Do you mean rank?

Perhaps by "consistency" you are referring to the relationship between INV and GINV when the matrix is nonsingular. Yes, INV and GINV should return the same matrix when the input is nonsingular.


@Rick_SAS wrote:

I'm sorry, but I'm not familiar with the phrase "type of singularity." Do you mean rank?

Perhaps by "consistency" you are referring to the relationship between INV and GINV when the matrix is nonsingular. Yes, INV and GINV should return the same matrix when the input is nonsingular.

Yes, I'm not sure exactly what I mean either.

If you have a binary variable and use two columns of zeros and ones, call them X1 and X2, then you have the singularity X1 + X2 = 1. But there are other causes of singularity, for example 3*X1 + 2*X2 - 7*X3 = 4*X4 - 0.5*X5 + 1.4*X6. I'm just wondering if the algorithm in GINV does the same thing regardless of how complex or simple the singularity is. Or, to use PROC GLM terminology: does the parameterization change depending on something else? The default parameterization of a binary variable sets the coefficient of the last alphabetical level to zero, but other parameterizations can be requested, giving different parameter estimates.

--

Paige Miller


Hi @PaigeMiller,

Both of your examples are rank-1 singularities because you can write one variable as a linear combination of the others. Mathematically, there is no difference between those two examples. If you have a SECOND linear dependency, then you have a rank-2 singularity. All rank-2 singularities are equivalent, but they are different from the rank-1 singularities, as shown by the Fundamental Theorem of Linear Algebra.
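As an aside, the numerical rank (and hence the "rank-k" label) can be checked in IML via the singular values. A small sketch for the X1 + X2 = 1 example (the data are made up for illustration):

```
proc iml;
/* Sketch: count the numerically nonzero singular values of X`X.
   x1 and x2 are complementary dummy columns, so x1 + x2 = 1. */
x1 = {1, 0, 1, 0, 1};
x2 = 1 - x1;
X = j(5, 1, 1) || x1 || x2;     /* intercept, x1, x2 */
A = X`*X;
call svd(U, D, V, A);           /* D holds the singular values */
rank = sum(D > 1e-12 * max(D)); /* tolerance-based rank: 2, not 3 */
print rank;
```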

If the model is the same, you should get the same estimates regardless of the reference levels. For example, if you use two different GLM parameterizations (with different reference levels), then you will get the same parameter estimates.


@Rick_SAS wrote:

Hi @PaigeMiller,

Both of your examples are rank-1 singularities because you can write one variable as a linear combination of the others. Mathematically, there is no difference between those two examples. If you have a SECOND linear dependency, then you have a rank-2 singularity. All rank-2 singularities are equivalent, but they are different from the rank-1 singularities, as shown by the Fundamental Theorem of Linear Algebra.

Probably I wasn't thinking or explaining clearly, so let's drop this part.

If the model is the same, you should get the same estimates regardless of the reference levels.

Estimates of what? I agree that the predicted values will be the same regardless of the reference levels, but the model coefficients change when the reference level changes.

For example, if you use two different GLM parameterizations (with different reference levels), then you will get the same parameter estimates.

I don't agree; the parameter estimates change when you change the reference level. You can do the algebra and wind up with the exact same model in each case, and get the exact same predicted values, but the output from the PROC will show different coefficients.

--

Paige Miller


Hi @PaigeMiller,

I think we are hijacking this thread from the OP. I'm happy to discuss this further, but maybe you should start another thread.

Yes, PROC GLM will obtain different parameter estimates. It uses what is known as a G2 pseudoinverse (which is not unique) for the parameter estimates. However, the GINV function in SAS/IML computes the Moore-Penrose inverse, which is unique. Even if you use two GLM parameterizations (with different reference levels), you will obtain the same generalized inverse and therefore the same parameter estimates.
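The uniqueness of the Moore-Penrose inverse can be checked directly: GINV's result satisfies the four Moore-Penrose conditions, which pin it down uniquely. A small sketch (the singular matrix is made up for illustration):

```
proc iml;
/* Sketch: verify the four Moore-Penrose conditions for GINV.
   Rows 1 and 2 of A are proportional, so A is singular. */
A = {1 2 3,
     2 4 6,
     1 1 1};
G = ginv(A);
c1 = max(abs(A*G*A - A));       /* A*G*A = A         */
c2 = max(abs(G*A*G - G));       /* G*A*G = G         */
c3 = max(abs((A*G)` - A*G));    /* A*G is symmetric  */
c4 = max(abs((G*A)` - G*A));    /* G*A is symmetric  */
print c1 c2 c3 c4;              /* all should be ~0  */
```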


I'm going to guess that the 1e-6 term is a ridging parameter and that they are actually inverting the matrix

X`*X + 1e-6*I

If you provide the reference or the exact quote, then we won't have to guess.


The paper is Adrian, Crump and Moench (2015), "Regression-based estimation of dynamic asset pricing models", p. 218, eq. (36), with more details in Appendix A.2, p. 229.

I am reproducing their method, but this problem appeared when computing the optimal bandwidth (kernel regressions are here similar to weighted least squares). The dataset consists of stock returns.

I guess that using this "1e-6 trick" in the above case would indeed deliver: X`*X + 1E-6*I

They call it the trimming parameter, denoted rho_T.


@Alain38 wrote:

The paper is Adrian, Crump and Moench (2015): "Regression-based estimation of dynamic asset pricing models" p.218, eq (36), with more details in appendix A.2. p.229.

I am reproducing their method, but this problem appeared when computing the optimal bandwidth (kernel regressions are here similar to weighted least squares). The dataset consists of stock returns.

I guess that using this "1e-6 trick" in the above case would indeed deliver : X`*X + 1E-6*I

They call it the trimming parameter, denoted rho_T.

It is my understanding that this ridging parameter (the 1e-6 trick) is most useful when the variables are highly (but not perfectly) correlated, rather than when the matrix is exactly singular. However, it could certainly be used when the matrix is exactly singular. Nevertheless, this all gets back to the fact that the results are not unique, which still makes me uncomfortable.
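One way to evaluate the two approaches side by side: as the ridge parameter shrinks toward 0, the ridged estimate approaches the minimum-norm (Moore-Penrose) solution, so the two methods agree in the limit. A sketch with simulated data (my construction, not anyone's actual data):

```
proc iml;
/* Sketch: compare the ridged solution to the Moore-Penrose
   solution as rho decreases. The last column of X is an exact
   linear combination of the others, so X`X is singular. */
call randseed(2);
z = randfun({50 2}, "Normal");
X = j(50, 1, 1) || z || (z[,1] + z[,2]);
Y = randfun({50 1}, "Normal");
b_mp = ginv(X`*X)*X`*Y;               /* minimum-norm solution */
do i = 2 to 8 by 2;
   rho = 10##(-i);
   b_r = inv(X`*X + rho*I(4))*X`*Y;   /* ridged solution */
   diff = max(abs(b_r - b_mp));
   print rho diff;                    /* diff shrinks with rho */
end;
```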

--

Paige Miller
