I am a mathematical statistics masters student at the University of Pretoria, South-Africa.
The SWEEP function in SAS IML works very well to calculate the G1 generalized inverse of a large matrix. I need to calculate the G1 generalized inverse (only satisfying the condition AGA = A, not the Moore-Penrose inverse) of a large matrix using the SWEEP operator. I want to program the sweep operator.
The matrices I need to calculate the G1 generalized inverse of is very large, dimension approximately 3000 x 3000.
I have tried to program the sweep operator to calculate the G1 generalized inverse using the following article as a reference
"A Tutorial on the SWEEP Operator" Author: James H. Goodnight.
My program is not generating the correct G1 generalized inverse.
I also want to make the program efficient.
Can you please guide me and give me advice on how to program the sweep operator?
Is it wise to use a loop in the program?
ginv(A) = SWEEP(A,1:nrow(A))
The sweep operator applied to all the rows/columns of the matrix = G1 generalized inverse
Is there some trick that can help me make my program more efficient? Since I am working with very large matrices.
proc iml;
/*SWEEP operator to calculate G1 - first attempt*/
start sweep_ginv(A);
n = nrow(A);
do k = 1 to n;
B = J(n,n,0);
print A;
p = A[k,k];
B[setdif(1:n,k),k] = -A[setdif(1:n,k),k]/p;
B[k,setdif(1:n,k)] = A[k,setdif(1:n,k)]/p;
B[setdif(1:n,k),setdif(1:n,k)] = A[setdif(1:n,k),setdif(1:n,k)]-(1/p)*A[setdif(1:n,k),k]*A[k,setdif(1:n,k)];
B[k,k] = 1/p;
print B;
A = B;
print A;
end;
return(B);
finish sweep_ginv;
***********************************************************************;
/*Second attempt*/
start sweep_ginv2(F);
n = nrow(F);
range = 1:n;
do k = 1 to n;
B = J(n,n,0);
p = F[k,k];
B[loc(range ^= k),k] = -F[loc(range ^= k),k]/p;
B[k,loc(range ^= k)] = F[k,loc(range ^= k)]/p;
B[loc(range ^= k),loc(range ^= k)] = F[loc(range ^= k),loc(range ^= k)]-(1/p)*F[loc(range ^= k),k]*F[k,loc(range ^= k)];
B[k,k] = 1/p;
F = B;
end;
return(B);
finish sweep_ginv2;
******************************************************************;
/*Test for small matrices first*/
seed = 1234;
n = 40;
c = J(n, n, seed);
A = normal(c);
*******************************************************************;
/*Test for generalized inverse
the sum should be zero*/
T = sweep_ginv2(A);
sum = sum(A*T*A - A);
print sum;
quit;