BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robynn
Calcite | Level 5

Hi,

I am using a do loop to create lag variable. The point is I have to use the last observation's data to compute the lagged variable for the next observation. I try the  following:

data data.coffee;
set data.coffee;
by PANID;
if first.PANID then NO = 0;
NO + 1;
array brandloy(6) brandloy1 - brandloy6;
array lag_brandloy(6) lag_brandloy1 - lag_brandloy6;
do i = 1 to 6;
if NO = 1 then brandloy(i) = 0.04;
if NO = 1 and brandID = i then brandloy(i) = 0.8;
do k = 2 to 160;
lag_brandloy(i) = lag(brandloy(i));
if NO = k and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
end;
end;
run;

 

 

But the result is not what I want. 

lag.png

The lag does not work here.

 

 

If I do it one by one like this:

 

if NO = 2 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));
if NO = 3 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));
if NO = 4 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));

the result is correct. So I wonder why the loop code does not work. And any suggestions about what I should do?

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

As you have seen, the LAG function is more complex than it looks.  For your application, I would recommend skipping it entirely.  You can still get the results you want through this combination of steps:

 

(a) compute the lagged values before computing the unlagged values

(b) retain the unlagged values

 

For example:

 

data data.coffee;

set data.coffee;

by panid;

if first.PANID then NO=0;

NO + 1;

array brandloy (6) brandloy1 - brandloy6;

array lag_brandloy (6) lag_brandloy1-lag_brandloy_6;

 

So far, no changes.  But now the idea is this:

 

retain brandloy1 - brandloy6;

if first.panid=0 then do i=1 to 6;

   lag_brandloy(i) = brandloy(i);

end;

do i=1 to 6;

   ** Compute only the brandloy values, not the lag_brandloy values;

end;

run;

 

You still need to fill  in the details ... computing brandloy values, when to compute lag_brandloy values, etc.  But this approach will get around the complexities of the LAG function.

View solution in original post

8 REPLIES 8
mkeintz
PROC Star

You have shown output data that you don't want. But you haven't shown (1) sampel input data, (2) output data you DO want.

 

Make your objectives clear - show the input data, and desired output data.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
robynn
Calcite | Level 5

Thanks for your reply!

The input data:

brand_lastindicator

0

1

0

0

1

1

...

My desired output data(which can be realized by code them one by one)

lag 1.png

 

if No=3 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=3 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=4 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=5 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);

I want to do the same thing from No=2 to 160. 

 

mkeintz
PROC Star

I do not see the variables PANID, BRANDLOY1, ... BRANDLOY6 in your sample input data.  Can you provide some actual data?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
robynn
Calcite | Level 5

Here is a data sample

robynn
Calcite | Level 5

I don't have the brandloy1-brandloy6 at first, I need to compute them.

For each PANID, I have to compute the brandloy(i)  for the first observation, then the second and so on.

Astounding
PROC Star

As you have seen, the LAG function is more complex than it looks.  For your application, I would recommend skipping it entirely.  You can still get the results you want through this combination of steps:

 

(a) compute the lagged values before computing the unlagged values

(b) retain the unlagged values

 

For example:

 

data data.coffee;

set data.coffee;

by panid;

if first.PANID then NO=0;

NO + 1;

array brandloy (6) brandloy1 - brandloy6;

array lag_brandloy (6) lag_brandloy1-lag_brandloy_6;

 

So far, no changes.  But now the idea is this:

 

retain brandloy1 - brandloy6;

if first.panid=0 then do i=1 to 6;

   lag_brandloy(i) = brandloy(i);

end;

do i=1 to 6;

   ** Compute only the brandloy values, not the lag_brandloy values;

end;

run;

 

You still need to fill  in the details ... computing brandloy values, when to compute lag_brandloy values, etc.  But this approach will get around the complexities of the LAG function.

robynn
Calcite | Level 5
Many thanks!!
Tom
Super User Tom
Super User

The way the LAG<n>() functions work you need to have a separate statement for each stack of lagged values you want to create.

data have ;
  input x @@ ;
  x_lag1 = lag1(x);
  x_lag2 = lag2(x);
  x_lag3 = lag3(x);
  x_lag4 = lag4(x);
  x_lag5 = lag5(x);
  x_lag6 = lag6(x);
cards;
0 1 0 0 1 1 0 1 0 1
;

You could generate this repetitive code using a macro if you want.

%macro genlag(var,n);
%local i;
%do i=1 %to &n ;
&var._lag&i = lag&i(&var);
%end;
%mend ;

data want ;
  set have ;
  %genlag(x,6);
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 5657 views
  • 2 likes
  • 4 in conversation