Hello all,
I simulate time series model and in some point the simulation stops
and I get the message:
ERROR: Overflow error in
ERROR: Matrix should be non-singular.
so I don't get the full simulated data.
I saw in SAS documentation that this problem is a known one in MCMC procedure (SAS/STAT(R) 9.22 User's Guide
I simulate in SAS/IML and not in these procedure but the idea is close.
In some point it can be seen in the results that there is a huge or really small simulated number.
I'll appreciate Ideas to solve the problem,
Thank you,
Orit
I enclose my program:
proc iml;
p = 1:240; /* 1 x 240 matrix, time trend */
tr = shape(p, 240, 1);/* 240 x 1 matrix, time trend */
tr2=tr##2;
n=10000;
b=j(8,n);
Ymean=j(n,1);
Ystd=j(n,1);
SLmean=j(n,1);
SLstd=j(n,1);
TLmean=j(n,1);
TLstd=j(n,1);
do j=1 to n;
xtl=j(240, 1);
xsl=j(240, 1);
y=j(240, 1);
xtl[1,1]=-0.020033;
xsl[1,1] = 6.25185;
y[1,1] =-0.064516129;
a=0.25134 ;*first regression*;
k=0.40409;
ka=-0.01867;
kb=1.83886;
kc=-0.00105 ;
kd=-73.21356;
f=-0.00327;
g=0.00000900;
c=0.05664; *second regression*;
e=0.09335;
d=0.94819;
h=-1.81474;
n=-0.00073982;
m=0.00183; *third regression*;
l=0.00132;
q=-0.00046477;
r=0.99164;
t=-0.00001458;
/** specify the mean and covariance of the population **/
Mean = {0, 0, 0};
Cov = {0.0046949688 0.0008310241 0.0000206459,
0.0008310241 0.0073176901 0.0000450616,
0.0000206459 0.0000450616 0.0000021738};
NumSamples = 240;
call randseed(0); /** set seed for the RandNormal module **/
U = RandNormal(NumSamples, Mean, Cov);
do i=2 to 240;
Y[i,1]=a+k*y[(i-1),1]+ka*xsl[(i-1),1]+kb*xtl[(i-1),1]+kc*xsl[(i-1),1]##2+kd*xtl[(i-1),1]##2+f*i+g*i*i+u[i,1];
Y[i,1]=round(Y[i,1],0.00001);
xsl[i,1]=c+e*y[(i-1),1]+d*xsl[(i-1),1]+h*xtl[(i-1),1]+u[i,2]+n*i;
xsl[i,1]=round(xsl[i,1],0.00001);
xtl[i,1]=m+l*y[(i-1),1]+q*xsl[(i-1),1]+r*xtl[(i-1),1]+u[i,3]+t*i;
xtl[i,1]=round( xtl[i,1],0.00001);
end;
ylag=lag(y);
xsl2=xsl##2;
xtl2=xtl##2;
/* Step 1: Compute X`X and X`y */
x = j(nrow(y), 1, 1) || ylag || xsl|| xtl || xsl2||xtl2||tr||tr2; /* add intercept column */
v = {1}; /* specify rows to exclude */
idx = setdif(1:nrow(x), v); /* start with 1:240, exclude values in v */
xnew = x[idx, ]; /* extract submatrix */
xpx = xnew`* xnew; /* cross-products */
v = {1}; /* specify rows to exclude */
idy = setdif(1:nrow(y), v); /* start with 1:5, exclude values in v */
ynew = y[idx, ]; /* extract submatrix*/
xpy = xnew` * ynew;
b[,j]= solve(xpx, xpy);
Ymean[j,]=mean(y);
Ystd[j,]=std(y);
SLmean[j,]=mean(xsl);
SLstd[j,]=std(xsl);
TLmean[j,]=mean(xtl);
TLstd[j,]=std(xtl);
end;
bt=b`;
varNames = "b0":"b7";
experiment="Sim1":"Sim10000";
print bt[colname=varNames
rowname=experiment
label="Expanded Model"];
print Ymean Ystd SLmean SLstd TLmean TLstd xsl2mean xsl2std xtl2mean xtl2std;
create Mydata var {Ymean Ystd SLmean SLstd TLmean TLstd};
append;
close Mydata;
quit;
I don't follow everything that your program does, but here is my conjecture:
Your population covariance matrix has an eigenvalue of 1.8e-6 (v=eigval(Cov);print v;), which means that it is very close to singular. Each simulated set of MV Normal data will have a sample covariance that is close to (but not equal to) the population covariance. That means that you are constantly generating data for which the third variable is almost exactly a linear combination of the first two variables. Eventually, you are hitting a singular condition.
I suspect that rounding the components greatly increases the chance of simulating degenerate data. To see if I'm right, try rounding to 0.01 and see if the singularity occurs more often. Try rounding to 1e-12 to see if the singularity occurs less often.
Rick, thank you for your heplful answer!!
and yes you're right.
Orit
Dear Rick,
I checked your hypothesis and I took a slight different model (just y is slight different)
with really close covariance matrix -with an eigenvalue of 1.8e-6
and the simulation worked well. I enclosed the program.
What do you think? do you have an idea how can it be? and why the first simulation does not work?
Thank you,
Orit
proc iml;
p = 1:240; /* 1 x 240 matrix, time trend */
tr = shape(p, 240, 1);/* 240 x 1 matrix, time trend */
tr2=tr##2;
*print tr,tr2;
n=10000;
b=j(8,n);
do j=1 to n; *Number of simulations;
xtl = j(240, 1);
xsl=j(240, 1);
y=j(240, 1);
xtl[1,1]=0.929247076;
xsl[1,1] = 9.066849783;
y[1,1] = 0.55;
a=-149.25336 ;*first regression*;
k=0.75322;
f=-0.00420;
g=0.00002401;
c=1.97709; *second regression*;
e=0.00982;
d=0.94927;
h=-1.88453;
n=-0.00072129;
m=0.01462; *third regression*;
l=0.00042239;
q=-0.00046726;
r=0.98752;
t=-0.00001485;
/** specify the mean and covariance of the population **/
Mean = {0, 0, 0};
Cov = {0.0450507141 0.0020509943 0.0000558774,
0.0020509943 0.0073514024 0.0000450563,
0.0000558774 0.0000450563 0.0000021530};
*v=eigval(Cov);*print v;
NumSamples = 240;
call randseed(0); /** set seed for the RandNormal module **/
U = RandNormal(NumSamples, Mean, Cov);
do i=2 to 240;
Y[i,1]=a+k*y[(i-1),1]+f*i+g*i*i+u[i,1];
xsl[i,1]=c+e*y[(i-1),1]+d*xsl[(i-1),1]+h*xtl[(i-1),1]+u[i,2]+n*i;
xtl[i,1]=m+l*y[(i-1),1]+q*xsl[(i-1),1]+r*xtl[(i-1),1]+u[i,3]+t*i;
end;
ylag=lag(y);
xsl2=xsl##2;
xtl2=xtl##2;
*print xsl xsl2;
/* Step 1: Compute X`X and X`y */
x = j(nrow(y), 1, 1) || ylag || xsl|| xtl || xsl2||xtl2||tr||tr2; /* add intercept column */
v = {1}; /* specify rows to exclude */
idx = setdif(1:nrow(x), v); /* start with 1:5, exclude values in v */
xnew = x[idx, ]; /* extract submatrix */
xpx = xnew`* xnew; /* cross-products */
v = {1}; /* specify rows to exclude */
idy = setdif(1:nrow(y), v); /* start with 1:5, exclude values in v */
ynew = y[idx, ]; /* extract submatrix*/
xpy = xnew` * ynew;
b[,j]= solve(xpx, xpy);
end;
bt=b`;
varNames = "b0":"b7";
experiment="Sim1":"Sim10000";
print bt[colname=varNames
rowname=experiment
label="Expanded Model"];
quit;
You didn't round, so the probability of a singular matrix is much lower. If you are good at math and numerical linear algebra, here is an article about generating random matrices and the probability of generating a singular matrix: What is the chance that a random matrix is singular? - The DO Loop
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.