BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Junyong
Pyrite | Level 9

Hello. I am simulating something using IML and the simulation includes a matrix inverse. For example,

proc iml;
	do SIMULATION=1 to 100;
		submit SIMULATION;
			%put ITERATION &SIMULATION;
		endsubmit;
		MATRIX=ranbin(j(100,100,1),1,0.1);
		INVERSE=inv(MATRIX);
	end;
quit;

The problem is that the target matrix is not always invertible. The random matrix is singular sometimes and SAS aborts at the moment. For example, the SAS code above stops at the 29th iteration. To avoid any interruption, I can add one more line as follows.

proc iml;
	do SIMULATION=1 to 100;
		submit SIMULATION;
			%put ITERATION &SIMULATION;
		endsubmit;
		MATRIX=ranbin(j(100,100,1),1,0.1);
		if det(MATRIX)=0 then INVERSE=ginv(MATRIX);
		else INVERSE=inv(MATRIX);
	end;
quit;

I can use a pseudoinverse after checking the determinant. The problem is that the target matrix is bigger than 10,000×10,000 and both DET and GINV functions are computationally expensive. In every iteration, this SAS code computes two numbers det(MATRIX) and INVERSE. I wonder if there is any iferror-style solution that is cheaper than my approach. I do not want to use any DET function more. Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

I'd still like to know what you are trying to accomplish with your simulation. For example, I have simulated the distribution of eigenvalues for random orthogonal matrices, and the distribution of eigenvalues for random symmetric matrices where each element is from a given distribution. In your example, it looks like you are using random symmetric 0/1 matrices, but it is not clear what you are trying to find out about this population of matrices. There have been many mathematical papers written about the singular varieties of (0,1) matrices; some might contains mathematical facts that might be helpful in your investigation.

 

To answer your question, the SAS/IML documentation shows how you can use the CALL PUSH statement and the RESUME statement to handle errors in SAS/IML modules. One of the examples shows how to recover from an error when you call the INV function on a singular matrix. Here is a little SAS/IML module that demonstrates the technique:

 

proc iml;
/* EXtended version of the INV function. If the matrix is not 
   invertible then return a missing value. Otherwise return the 
   inverse. See the IML doc :  
   http://support.sas.com/documentation/cdl/en/imlug/68150/HTML/default/viewer.htm#imlug_genstmts_sect010.htm
*/
start InvEx(A); 
   on_error = "if error then do;  AInv = .;  resume; end;";
   call push(on_error);   /* PUSH code that will be executed if an error occurs */
   error = 1;             /* set flag before calling INV */
   AInv = inv(A);
   error = 0;             /* remove flag after calling INV */
   return ( AInv );
finish;

A1 = {1 1,
      0 1};
B1 = InvEx(A1);
A2 = {1 1,
      1 1};
B2 = InvEx(A2);

print B1, B2;

 

 

View solution in original post

6 REPLIES 6
Rick_SAS
SAS Super FREQ

Saying " I am simulating something" doesn't provide many details, but let me give a brief overview of some solutions.

 

First, the determinant is not a good numerical test for singularity. The concept you want to use is the numerical rank of the matrix.

You can read about how to compute the rank of a matrix in SAS, which includes a section on why the determinant is not a good tool. The techniques in this article are mostly used for rectangular matrices, which don't have inverses.

 

Second, forming the explicit inverse of a 10000x10000 matrix can be numerically unstable and inefficient. A better approach is to numerically solve the problem A*x = b for the vector x that satisfies the matrix equation. See the article "Do you really need that matrix inverse?" for more information. Also see "Solving linear systems, which technique is fastest."

 

Although you do not say it, I am going to guess that the "something" that you are simulating might involve a SYMMETRIC matrix such as the covariance matrix or the crossproducts matrix. If your matrix is a covariance, then your best approach is to use the ROOT function, which involves an argument that tells IML to return missing values when the matrix is singular (not positive definite). For symmetric matrices, this will be the fastest method because it requires only one call. If the matrix is not singular, you get the Cholesky factorization, which you can use to solve the linear system.

 

Junyong
Pyrite | Level 9

In fact, I tried to attach this to my post below, but it disappears as soon as I attach repeatedly. I post here instead. Anyway, many thanks, Rick, but I think I should attach few things. Here I update the SAS code I attached before.

/**************************************************
Several Monte Carlo Examples
Junyong Kim
October 28, 2017
**************************************************/
resetline;
ods html close;
ods graphics off;
ods listing;

%let NOFSIMULATION=30;
%let SIZE=300;

/**************************************************
1. The Way I Firstly Intended
 - MATRIX is randomly drawn in each iteration
 - MATRIX is symmetric, but singular sometimes
 - This code stops at the 29th iteration
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		INVERSE=inv(MATRIX);
		SOLUTION=INVERSE*j(&SIZE.,1,1);
	end;
quit;

/**************************************************
2. Detouring Using a Pseudoinverse
 - The code checks the determinant of MATRIX
 - Computes the inverse if it is zero
 - Computes the pseudoinverse if it is not
 - This code does not stop during the iteration
 - Takes 2.92 secs (real time) in my laptop
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		if det(MATRIX)=0 then INVERSE=ginv(MATRIX);
		else INVERSE=inv(MATRIX);
		SOLUTION=INVERSE*j(&SIZE.,1,1);
	end;
quit;

/**************************************************
3. Rank Instead of Determinant
 - As Rick mentioned, altered DET by RANK
 - DET is computationally poor
 - RANK works well in checking singularity
 - This code does not stop during the iteration
 - Takes 9.31 secs (real time) in my laptop
 ★ Seems that DET is way cheaper computationally
 ☆ round(trace(ginv(A)*A)) computes A's rank
 ☆ Modified 171029 0345; thanks Ian
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		*if rank(MATRIX)=&SIZE. then INVERSE=inv(MATRIX);
		if round(trace(ginv(MATRIX)*MATRIX))=&SIZE. then INVERSE=inv(MATRIX);
		else INVERSE=ginv(MATRIX);
		SOLUTION=INVERSE*j(&SIZE.,1,1);
	end;
quit;

/**************************************************
4. SOLVE Instead of INV
 - As Rick mentioned, altered INV by SOLVE
 - SOLVE is numerically faster
 - INV is more general but relatively slower
 - b=inv(A)*c and b=solve(A,c) are equivalent
 - (if A is invertible)
 - This code does not stop during the iteration
 - Takes 2.42 secs (real time) in my laptop
 ★ Seems that SOLVE is computationally cheaper
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		if det(MATRIX)=0 then SOLUTION=ginv(MATRIX)*j(&SIZE.,1,1);
		else SOLUTION=solve(MATRIX,j(&SIZE.,1,1));
	end;
quit;

/**************************************************
5. Cholesky Decomposition Using ROOT
 - As Rick mentioned, utilized ROOT
 - ROOT computes a Cholesky upper triangular
 - ROOT is the fastest when MATRIX is symmetric
 - Both inv(A) and root(A)*root(A)` are the same
 - This code stops at the 1st iteration
 ★ ROOT requires MATRIX to be positive-definite
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		if det(MATRIX)=0 then SOLUTION=ginv(MATRIX)*j(&SIZE.,1,1);
		else do;
			INVU=inv(root(MATRIX));
			INVERSE=INVU*INVU`;
			SOLUTION=INVERSE*j(&SIZE.,1,1);
		end;
	end;
quit;

This code includes five different ways. The purpose is to invert MATRIX, which is random and symmetric but singular sometimes.

1. Use just a INV function (aborts at the 24th iteration because MATRIX is singular)
2. Check using DET and GINV if singular (does not abort, takes about 3 seconds in my laptop)
3. Check using RANK and GINV if rank-deficient (does not abort, but takes about 9 seconds)
4. Check using DET and SOLVE if invertible (does not abort, takes about 2.5 seconds)
5. Check using DET and ROOT if invertible (aborts at the 1st iteration because MATRIX is not positive-definite)

Here is my conclusion after all things considered.

1. It seems SOLVE rather than INV is computationally cheaper slightly if I do not need the inverse matrix per se.
2. However, RANK is way too much expensive computationally rather than DET.
3. Though RANK is expensive, it may be still useful because DET is impossible if MATRIX is large enough.
(SAS shows the message: ERROR: (execution) Unable to allocate sufficient memory. At least 2147483647 more bytes required)
4. ROOT is not applicable because it requires more than symmetriness, i.e. positive-definiteness.

Though your answer is still valuable for me, my question is still unanswered. Can I write a following code in SAS?

1. Invert MATRIX using inv(MATRIX). Just go to the next step if it is successful and skip Step 2.
2. Invert MATRIX using ginv(MATRIX) only if Step 1 is not successful.

Now my IML code requires one more step, i.e. checking if MATRIX is singular (using DET or RANK). In each iteration, SAS computes either (i) DET (or RANK) and INV or (ii) DET (or RANK) and GINV, but I only need INV (or GINV). DET (or RANK) is what I do not want. SOLVE is good, but I must use a pseudoinverse if MATRIX is singular.

 

p.s. The third version is modified (171029 0345). A's rank is computed not by rank(A) but by round(trace(ginv(A)*A)). This seems to be more expensive computationally since this involves GINV already. Thanks Ian.

IanWakeling
Barite | Level 11

I don't think that there is any way to recover from the singular matrix error with the INV function, so the initial test must be done if you go this route.

 

Please check the 3rd code example above as this is not doing what you think.  Specifically, the statement  "if rank(MATRIX)=&SIZE." will always evaluate as false, so GINV is used for all the simulations, and the timings are more a reflection of how much slower GINV is when compared to INV.  The RANK function in IML returns the rank order of the elements within a matrix.  To get the matrix rank, you will have to use one of the methods in Rick's article,  see link in the original topic which contains code for a RankMatrix function, however this will be expensive as the preferred methods involve the calculation of an SVD or a generalized inverse.

Junyong
Pyrite | Level 9

Many thanks, Rick, but I think I should attach few things. Here I update the SAS code I attached before.

/**************************************************
Several Monte Carlo Examples
Junyong Kim
October 28, 2017
**************************************************/
resetline;
ods html close;
ods graphics off;
ods listing;

%let NOFSIMULATION=30;
%let SIZE=300;

/**************************************************
1. The Way I Firstly Intended
 - MATRIX is randomly drawn in each iteration
 - MATRIX is symmetric, but singular sometimes
 - This code stops at the 29th iteration
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		INVERSE=inv(MATRIX);
		SOLUTION=INVERSE*j(&SIZE.,1,1);
	end;
quit;

/**************************************************
2. Detouring Using a Pseudoinverse
 - The code checks the determinant of MATRIX
 - Computes the inverse if it is zero
 - Computes the pseudoinverse if it is not
 - This code does not stop during the iteration
 - Takes 2.92 secs (real time) in my laptop
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		if det(MATRIX)=0 then INVERSE=ginv(MATRIX);
		else INVERSE=inv(MATRIX);
		SOLUTION=INVERSE*j(&SIZE.,1,1);
	end;
quit;

/**************************************************
3. Rank Instead of Determinant
 - As Rick mentioned, altered DET by RANK
 - DET is computationally poor
 - RANK works well in checking singularity
 - This code does not stop during the iteration
 - Takes 9.31 secs (real time) in my laptop
 ★ Seems that DET is way cheaper computationally
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		if rank(MATRIX)=&SIZE. then INVERSE=inv(MATRIX);
		else INVERSE=ginv(MATRIX);
		SOLUTION=INVERSE*j(&SIZE.,1,1);
	end;
quit;

/**************************************************
4. SOLVE Instead of INV
 - As Rick mentioned, altered INV by SOLVE
 - SOLVE is numerically faster
 - INV is more general but relatively slower
 - b=inv(A)*c and b=solve(A,c) are equivalent
 - (if A is invertible)
 - This code does not stop during the iteration
 - Takes 2.42 secs (real time) in my laptop
 ★ Seems that SOLVE is computationally cheaper
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		if det(MATRIX)=0 then SOLUTION=ginv(MATRIX)*j(&SIZE.,1,1);
		else SOLUTION=solve(MATRIX,j(&SIZE.,1,1));
	end;
quit;

/**************************************************
5. Cholesky Decomposition Using ROOT
 - As Rick mentioned, utilized ROOT
 - ROOT computes a Cholesky upper triangular
 - ROOT is the fastest when MATRIX is symmetric
 - Both inv(A) and root(A)*root(A)` are the same
 - This code stops at the 1st iteration
 ★ ROOT requires MATRIX to be positive-definite
**************************************************/
proc iml;
	do SIMULATION=1 to &NOFSIMULATION.;
		submit SIMULATION;
			%put Iteration=&SIMULATION;
		endsubmit;
		MATRIX=i(&SIZE.);
		do I=2 to &SIZE.;
			do J=1 to I-1;
				MATRIX[I,J]=ranbin(1,1,0.015);
				MATRIX[J,I]=MATRIX[I,J];
			end;
		end;
		if det(MATRIX)=0 then SOLUTION=ginv(MATRIX)*j(&SIZE.,1,1);
		else do;
			INVU=inv(root(MATRIX));
			INVERSE=INVU*INVU`;
			SOLUTION=INVERSE*j(&SIZE.,1,1);
		end;
	end;
quit;

This code includes five different ways. The purpose is to invert MATRIX, which is random and symmetric but singular sometimes.

1. Use just a INV function (aborts at the 24th iteration because MATRIX is singular)
2. Check using DET and GINV if singular (does not abort, takes about 3 seconds in my laptop)
3. Check using RANK and GINV if rank-deficient (does not abort, but takes about 9 seconds)
4. Check using DET and SOLVE if invertible (does not abort, takes about 2.5 seconds)
5. Check using DET and ROOT if invertible (aborts at the 1st iteration because MATRIX is not positive-definite)

Here is my conclusion after all things considered.

1. It seems SOLVE rather than INV is computationally cheaper slightly if I do not need the inverse matrix per se.
2. However, RANK is way too much expensive computationally rather than DET.
3. Though RANK is expensive, it may be still useful because DET is impossible if MATRIX is large enough.
(SAS shows the message: ERROR: (execution) Unable to allocate sufficient memory. At least 2147483647 more bytes required)
4. ROOT is not applicable because it requires more than symmetriness, i.e. positive-definiteness.

Though your answer is still valuable for me, my question is still unanswered. Can I write a following code in SAS?

1. Invert MATRIX using inv(MATRIX). Just go to the next step if it is successful and skip Step 2.
2. Invert MATRIX using ginv(MATRIX) only if Step 1 is not successful.

Now my IML code requires one more step, i.e. checking if MATRIX is singular (using DET or RANK). SAS computes both DET and INV in every iteration, but I only need INV. DET is what I do not want. SOLVE is good, but I must use a pseudoinverse if MATRIX is singular.

Ksharp
Super User
@Rick_SAS has already written a blog about it by using Hilbert Matrix. Search Hilbert Matrix at Rick's blog.
But I would like to use function ECHELON().

proc iml;
a = {3 6 9,
1 2 5,
2 4 10 };
e = echelon(a);
print e;
quit;

If E have a 0 in diagonal element, then E is singularity.

Rick_SAS
SAS Super FREQ

I'd still like to know what you are trying to accomplish with your simulation. For example, I have simulated the distribution of eigenvalues for random orthogonal matrices, and the distribution of eigenvalues for random symmetric matrices where each element is from a given distribution. In your example, it looks like you are using random symmetric 0/1 matrices, but it is not clear what you are trying to find out about this population of matrices. There have been many mathematical papers written about the singular varieties of (0,1) matrices; some might contains mathematical facts that might be helpful in your investigation.

 

To answer your question, the SAS/IML documentation shows how you can use the CALL PUSH statement and the RESUME statement to handle errors in SAS/IML modules. One of the examples shows how to recover from an error when you call the INV function on a singular matrix. Here is a little SAS/IML module that demonstrates the technique:

 

proc iml;
/* EXtended version of the INV function. If the matrix is not 
   invertible then return a missing value. Otherwise return the 
   inverse. See the IML doc :  
   http://support.sas.com/documentation/cdl/en/imlug/68150/HTML/default/viewer.htm#imlug_genstmts_sect010.htm
*/
start InvEx(A); 
   on_error = "if error then do;  AInv = .;  resume; end;";
   call push(on_error);   /* PUSH code that will be executed if an error occurs */
   error = 1;             /* set flag before calling INV */
   AInv = inv(A);
   error = 0;             /* remove flag after calling INV */
   return ( AInv );
finish;

A1 = {1 1,
      0 1};
B1 = InvEx(A1);
A2 = {1 1,
      1 1};
B2 = InvEx(A2);

print B1, B2;

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 6 replies
  • 2577 views
  • 7 likes
  • 4 in conversation