BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robynn
Calcite | Level 5

Hi,

I am using a do loop to create lag variable. The point is I have to use the last observation's data to compute the lagged variable for the next observation. I try the  following:

data data.coffee;
set data.coffee;
by PANID;
if first.PANID then NO = 0;
NO + 1;
array brandloy(6) brandloy1 - brandloy6;
array lag_brandloy(6) lag_brandloy1 - lag_brandloy6;
do i = 1 to 6;
if NO = 1 then brandloy(i) = 0.04;
if NO = 1 and brandID = i then brandloy(i) = 0.8;
do k = 2 to 160;
lag_brandloy(i) = lag(brandloy(i));
if NO = k and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
end;
end;
run;

 

 

But the result is not what I want. 

lag.png

The lag does not work here.

 

 

If I do it one by one like this:

 

if NO = 2 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));
if NO = 3 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));
if NO = 4 then brandloy(i) = 0.8*lag_brandloy(i) + 0.2*brand_lastindicator;
lag_brandloy(i) = lag(brandloy(i));

the result is correct. So I wonder why the loop code does not work. And any suggestions about what I should do?

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

As you have seen, the LAG function is more complex than it looks.  For your application, I would recommend skipping it entirely.  You can still get the results you want through this combination of steps:

 

(a) compute the lagged values before computing the unlagged values

(b) retain the unlagged values

 

For example:

 

data data.coffee;

set data.coffee;

by panid;

if first.PANID then NO=0;

NO + 1;

array brandloy (6) brandloy1 - brandloy6;

array lag_brandloy (6) lag_brandloy1-lag_brandloy_6;

 

So far, no changes.  But now the idea is this:

 

retain brandloy1 - brandloy6;

if first.panid=0 then do i=1 to 6;

   lag_brandloy(i) = brandloy(i);

end;

do i=1 to 6;

   ** Compute only the brandloy values, not the lag_brandloy values;

end;

run;

 

You still need to fill  in the details ... computing brandloy values, when to compute lag_brandloy values, etc.  But this approach will get around the complexities of the LAG function.

View solution in original post

8 REPLIES 8
mkeintz
PROC Star

You have shown output data that you don't want. But you haven't shown (1) sampel input data, (2) output data you DO want.

 

Make your objectives clear - show the input data, and desired output data.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
robynn
Calcite | Level 5

Thanks for your reply!

The input data:

brand_lastindicator

0

1

0

0

1

1

...

My desired output data(which can be realized by code them one by one)

lag 1.png

 

if No=3 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=3 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=4 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);
lag_brandloy(i) = lag(brandloy(i));
if No=5 and brandloy(i)= . then brandloy(i) = sum(0.8*lag_brandloy(i),0.2*brand_lastindicator);

I want to do the same thing from No=2 to 160. 

 

mkeintz
PROC Star

I do not see the variables PANID, BRANDLOY1, ... BRANDLOY6 in your sample input data.  Can you provide some actual data?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
robynn
Calcite | Level 5

Here is a data sample

robynn
Calcite | Level 5

I don't have the brandloy1-brandloy6 at first, I need to compute them.

For each PANID, I have to compute the brandloy(i)  for the first observation, then the second and so on.

Astounding
PROC Star

As you have seen, the LAG function is more complex than it looks.  For your application, I would recommend skipping it entirely.  You can still get the results you want through this combination of steps:

 

(a) compute the lagged values before computing the unlagged values

(b) retain the unlagged values

 

For example:

 

data data.coffee;

set data.coffee;

by panid;

if first.PANID then NO=0;

NO + 1;

array brandloy (6) brandloy1 - brandloy6;

array lag_brandloy (6) lag_brandloy1-lag_brandloy_6;

 

So far, no changes.  But now the idea is this:

 

retain brandloy1 - brandloy6;

if first.panid=0 then do i=1 to 6;

   lag_brandloy(i) = brandloy(i);

end;

do i=1 to 6;

   ** Compute only the brandloy values, not the lag_brandloy values;

end;

run;

 

You still need to fill  in the details ... computing brandloy values, when to compute lag_brandloy values, etc.  But this approach will get around the complexities of the LAG function.

robynn
Calcite | Level 5
Many thanks!!
Tom
Super User Tom
Super User

The way the LAG<n>() functions work you need to have a separate statement for each stack of lagged values you want to create.

data have ;
  input x @@ ;
  x_lag1 = lag1(x);
  x_lag2 = lag2(x);
  x_lag3 = lag3(x);
  x_lag4 = lag4(x);
  x_lag5 = lag5(x);
  x_lag6 = lag6(x);
cards;
0 1 0 0 1 1 0 1 0 1
;

You could generate this repetitive code using a macro if you want.

%macro genlag(var,n);
%local i;
%do i=1 %to &n ;
&var._lag&i = lag&i(&var);
%end;
%mend ;

data want ;
  set have ;
  %genlag(x,6);
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 5736 views
  • 2 likes
  • 4 in conversation