## Vectorizing Rather Than Looping in IML

I have a time-series variable x as follows.

``````data series;
do t=1 to 5000;
x=rannor(1);
output;
end;
run;``````

For each t, I want to estimate x's second moment using observations from 1 to t and a normal kernel as follows.

``````proc iml;
use series;
close;
s2=1:5000;
do s=1 to 5000;
s2[s]=pdf("norm",1:s,s,100)*x[1:s]##2/sum(pdf("norm",1:s,s,100));
end;
create series var{t x s2};
append;
close;
quit;``````

So I load series and apply this recursive loop since the second moment at t requires observations from 1 to t (x[1:s]) as well as the kernel weighting information (pdf("norm",1:s,s,100)). Is there any way to wisely alter this do loop by its equivalent vectorized method? This way takes too much time for a large T>5000. Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions

## Re: Vectorizing Rather Than Looping in IML

It is possible to represent these operations in terms of matrix multiplication, but if you have N observations the matrix will be NxN, which will get big.  I suspect the main performance bottleneck is not the loop but the fact that you are evaluating the normal PDF so many times. For N observations, you are evaluating the PDF N(N+1)/2 times.  This is unnecessary. You only need to evaluate the PDF at N locations. You can then use the properties of the normal PDF to reuse the values. (You also can compute x^2 once instead of multiple times.)

Let f be the normal PDF with standard deviation sigma=100. let f(x; mu) be the PDF for N(mu, 100) evaluated at x.

You are currently evaluating

s=1   f(1; 1)

s=2   f(1; 2)  f(2; 2)

s=3   f(1; 3)  f(2; 3)  f(3; 3)

s=4   f(1; 4)  f(2; 4)  f(3; 4)  f(4; 4)

...

s=N   f(1; N)  f(2; N)  f(3; N)  ...     f(N; N)

However, notice that f(1; 1) = f(2; 2) = f(3; 3) = .... = f(N; N).

Similarly, f(1; 2) = f(2; 3) = .... = f(N-1; N).

and f(1; 3) = f(2; 4) = ... = f(N-2; N).

In general, the triangular matrix above is banded because the values depend only on the distance between x and N.

If you reuse the PDF values, you can save a lot of time. On my computer, the original loop for N=8000 takes 3.4 seconds whereas the improved loop takes 0.2 seconds, (Only 5% of the original time.)

``````%let N = 8000;

data series;
do t=1 to &N;
x=rannor(1);
output;
end;
run;

proc iml;
use series;
close;

x2 = x##2;      /* form x^2 one time */

/* original loop */
t0 = time();
s2 = j(1, &N, .);
do s=1 to &N;
w = pdf("norm",1:s,s,100);
s2[s] = w * x2[1:s] / sum(w);
end;
tLoop = time() - t0;
s0 = s2;     /* save the oringal values */

/* new loop */
t0 = time();
w0 = pdf("norm", 1:&N, &N, 100);  /* eval the PDF once */
do s = 1 to &N;
w = w0[ , (&N-s+1):&N];
s2[s] = w * x2[1:s] / sum(w);
end;
tVec = time() - t0;
print tLoop tVec;

print (max(abs(s0-s2)))[L="Max Diff"];  /* show that the answers are the same */
``````

## Re: Vectorizing Rather Than Looping in IML

It is possible to represent these operations in terms of matrix multiplication, but if you have N observations the matrix will be NxN, which will get big.  I suspect the main performance bottleneck is not the loop but the fact that you are evaluating the normal PDF so many times. For N observations, you are evaluating the PDF N(N+1)/2 times.  This is unnecessary. You only need to evaluate the PDF at N locations. You can then use the properties of the normal PDF to reuse the values. (You also can compute x^2 once instead of multiple times.)

Let f be the normal PDF with standard deviation sigma=100. let f(x; mu) be the PDF for N(mu, 100) evaluated at x.

You are currently evaluating

s=1   f(1; 1)

s=2   f(1; 2)  f(2; 2)

s=3   f(1; 3)  f(2; 3)  f(3; 3)

s=4   f(1; 4)  f(2; 4)  f(3; 4)  f(4; 4)

...

s=N   f(1; N)  f(2; N)  f(3; N)  ...     f(N; N)

However, notice that f(1; 1) = f(2; 2) = f(3; 3) = .... = f(N; N).

Similarly, f(1; 2) = f(2; 3) = .... = f(N-1; N).

and f(1; 3) = f(2; 4) = ... = f(N-2; N).

In general, the triangular matrix above is banded because the values depend only on the distance between x and N.

If you reuse the PDF values, you can save a lot of time. On my computer, the original loop for N=8000 takes 3.4 seconds whereas the improved loop takes 0.2 seconds, (Only 5% of the original time.)

``````%let N = 8000;

data series;
do t=1 to &N;
x=rannor(1);
output;
end;
run;

proc iml;
use series;
close;

x2 = x##2;      /* form x^2 one time */

/* original loop */
t0 = time();
s2 = j(1, &N, .);
do s=1 to &N;
w = pdf("norm",1:s,s,100);
s2[s] = w * x2[1:s] / sum(w);
end;
tLoop = time() - t0;
s0 = s2;     /* save the oringal values */

/* new loop */
t0 = time();
w0 = pdf("norm", 1:&N, &N, 100);  /* eval the PDF once */
do s = 1 to &N;
w = w0[ , (&N-s+1):&N];
s2[s] = w * x2[1:s] / sum(w);
end;
tVec = time() - t0;
print tLoop tVec;

print (max(abs(s0-s2)))[L="Max Diff"];  /* show that the answers are the same */
``````

From The DO Loop