## using lag function in a do loop

Solved
Occasional Contributor
Posts: 5

Hi,

I am using a DO loop to create a lagged variable. The point is that I have to use the previous observation's data to compute the lagged variable for the next observation. I tried the following:

data data.coffee;
set data.coffee;
by PANID;
if first.PANID then NO = 0;
NO + 1;
array brandloy(6) brandloy1 - brandloy6;
array lag_brandloy(6) lag_brandloy1 - lag_brandloy6;
do i = 1 to 6;
if NO = 1 then brandloy(i) = 0.04;
if NO = 1 and brandID = i then brandloy(i) = 0.8;
do k = 2 to 160;
lag_brandloy(i) = lag(brandloy(i));
if NO = k and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
end;
end;
run;

But the result is not what I want.

The lag does not work here.

If I do it one by one like this:

if NO = 2 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));
if NO = 3 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));
if NO = 4 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));

the result is correct. So I wonder why the loop code does not work. Any suggestions about what I should do?

Thanks!

Accepted Solutions
Solution
‎02-25-2017 07:33 AM
Super User
Posts: 6,775

## Re: using lag function in a do loop

As you have seen, the LAG function is more complex than it looks.  For your application, I would recommend skipping it entirely.  You can still get the results you want through this combination of steps:

(a) compute the lagged values before computing the unlagged values

(b) retain the unlagged values

For example:

data data.coffee;
set data.coffee;
by panid;
if first.PANID then NO=0;
NO + 1;
array brandloy (6) brandloy1 - brandloy6;
array lag_brandloy (6) lag_brandloy1 - lag_brandloy6;

So far, no changes.  But now the idea is this:

retain brandloy1 - brandloy6;
if first.panid=0 then do i=1 to 6;
lag_brandloy(i) = brandloy(i);
end;
do i=1 to 6;
** Compute only the brandloy values, not the lag_brandloy values;
end;
run;

You still need to fill in the details, such as computing the brandloy values and deciding when to compute the lag_brandloy values. But this approach will get around the complexities of the LAG function.
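For illustration, here is one way the skeleton above might be filled in. Treat it as a sketch, not the definitive answer: it assumes data.coffee is sorted by PANID, that brandID takes values 1 to 6, and that brand_lastindicator is a single numeric variable as in the original post. The output name work.coffee_loy is made up, and the update rule is simply copied from the question.

```sas
/* Sketch only. Assumptions: data.coffee is sorted by PANID and contains
   brandID (1-6) and brand_lastindicator; work.coffee_loy is a
   hypothetical output name. */
data work.coffee_loy;
  set data.coffee;
  by PANID;

  array brandloy(6) brandloy1-brandloy6;
  array lag_brandloy(6) lag_brandloy1-lag_brandloy6;
  retain brandloy1-brandloy6;      /* carry values to the next observation */

  if first.PANID then NO = 0;
  NO + 1;

  do i = 1 to 6;
    if first.PANID then do;
      /* First observation of a panelist: starting values from the question */
      lag_brandloy(i) = .;
      if brandID = i then brandloy(i) = 0.8;
      else brandloy(i) = 0.04;
    end;
    else do;
      /* (a) copy the retained value from the previous observation first */
      lag_brandloy(i) = brandloy(i);
      /* (b) then overwrite brandloy with the new value */
      brandloy(i) = sum(0.8*lag_brandloy(i), 0.2*brand_lastindicator);
    end;
  end;

  drop i;
run;
```

One caveat: if brandloy1-brandloy6 already exist in the input data set (for example from an earlier run that overwrote data.coffee), the SET statement replaces the retained values on every observation, so those variables would need to be dropped from the input first.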

All Replies
Posts: 1,337

## Re: using lag function in a do loop

You have shown output data that you don't want. But you haven't shown (1) sample input data, (2) output data you DO want.

Make your objectives clear - show the input data, and desired output data.

Occasional Contributor
Posts: 5

## Re: using lag function in a do loop

The input data:

brand_lastindicator

0

1

0

0

1

1

...

My desired output data (which I can get by coding the steps one by one):

if No=2 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=3 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=4 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=5 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);

I want to do the same thing from No=2 to 160.

Posts: 1,337

## Re: using lag function in a do loop

I do not see the variables PANID, BRANDLOY1, ... BRANDLOY6 in your sample input data.  Can you provide some actual data?

Occasional Contributor
Posts: 5

## Re: using lag function in a do loop

Here is a data sample

Occasional Contributor
Posts: 5

## Re: using lag function in a do loop

I don't have brandloy1-brandloy6 at first; I need to compute them.

For each PANID, I have to compute the brandloy(i)  for the first observation, then the second and so on.

Occasional Contributor
Posts: 5

Many thanks!!
Super User
Posts: 8,111

## Re: using lag function in a do loop

Because of the way the LAG<n>() functions work, you need a separate LAG call for each queue of lagged values you want to create.

data have ;
input x @@ ;
x_lag1 = lag1(x);
x_lag2 = lag2(x);
x_lag3 = lag3(x);
x_lag4 = lag4(x);
x_lag5 = lag5(x);
x_lag6 = lag6(x);
cards;
0 1 0 0 1 1 0 1 0 1
;

You could generate this repetitive code using a macro if you want.

%macro genlag(var,n);
%local i;
%do i=1 %to &n ;
&var._lag&i = lag&i(&var);
%end;
%mend ;

data want ;
set have ;
%genlag(x,6);
run;
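To see why a single LAG call inside a DO loop behaves unexpectedly, a small test like the following may help. Each LAG call in the program keeps its own queue and returns whatever value was passed to that same call the last time it executed, which is not necessarily the value from the previous observation. The data set and variable names here (lag_demo, prev_x, prev_y) are made up for the demonstration.

```sas
/* Demonstration only: one LAG call site shared by two variables. */
data lag_demo;
  input x y;
  array v(2) x y;
  array p(2) prev_x prev_y;
  do i = 1 to 2;
    /* One call, one queue: it returns the argument from the previous
       execution of this statement, regardless of the value of i. */
    p(i) = lag(v(i));
  end;
  drop i;
  cards;
1 10
2 20
3 30
;

proc print data=lag_demo;
run;

/* Result: prev_x = ., 10, 20 and prev_y = 1, 2, 3.
   prev_y holds the current x, and prev_x holds the previous y,
   because both loop passes feed the same queue. */
```

This is why the macro above generates a separate LAG call for each lag it needs instead of looping over one call at run time.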