DATA Step, Macro, Functions and more

Must I reset the value of lag() when using BY variables?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

Must I reset the value of lag() when using BY variables?

I would like to count the number of runs in a dataset, but I believe that my code has a hole in it.

I had thought to use:

proc sort data=stuff; by key id;

  run;

data runs;

  set stuff; by key;

  retain runs;

  if first.key then runs=1;

  else if id ne lag(id) then runs+1;

  if last.key then output;

  run;

However, it appears that lag(id) is not set to missing when the BY variable key changes. Am I looking at this wrong? Is there a faster/better way to do it?

Thanks!


Accepted Solutions
Solution
‎08-13-2014 06:48 PM
Respected Advisor
Posts: 4,173

Re: Must I reset the value of lag() when using BY variables?

Never use lag() within an if condition as it only updates the lag value when executed - so the function needs to execute in every single iteration of the data step.

and as already posted: there is no need to use lag()

View solution in original post


All Replies
Respected Advisor
Posts: 3,799

Re: Must I reset the value of lag() when using BY variables?

Posted in reply to dan_waldo

You don't need lag.

BY KEY ID;

if first.key then runs=0;

if first.id then runs+1;

if last.key then output;

Trusted Advisor
Posts: 1,231

Re: Must I reset the value of lag() when using BY variables?

Posted in reply to dan_waldo

Just need a slight change

data runs;

set stuff;

by key;

delta=id-lag(id);

if first.key then runs=1;

if not first.key then do;

if delta ne 0 then runs+1;

end;

if last.key then output;

run;

Solution
‎08-13-2014 06:48 PM
Respected Advisor
Posts: 4,173

Re: Must I reset the value of lag() when using BY variables?

Never use lag() within an if condition as it only updates the lag value when executed - so the function needs to execute in every single iteration of the data step.

and as already posted: there is no need to use lag()

SAS Super FREQ
Posts: 8,869

Re: Must I reset the value of lag() when using BY variables?

Hi:

  Patrick is correct. Please refer to this Tech Support note that demonstrates that using lag conditionally will probably result in undesired results: 24665 - Use the LAG function to conditionally carry information down a data set

  If you run the code in that note, the approach that creates variable is the approach to use. As the note indicates (in the code), the LAG function stores values on the queue only when it is called, so you must call LAG unconditionally (outside the IF condition) to get the correct lagged value. And this paper has a longer explanation of the usage http://support.sas.com/resources/papers/proceedings09/055-2009.pdf.

cynthia

Respected Advisor
Posts: 3,156

Re: Must I reset the value of lag() when using BY variables?

At the risk of being obnoxious, I dare say 'never say never' Smiley Happy. I can't think of any practical usage of conditional lag(), except I have had an rather unique request from a customer. say I have some balance record for checking accounts, they want to know the amount of the LAST over-draft if there is an existing one. Of course, there are many ways to do it, but lag() is also handy. 

data have;

     input balance @@;

     cards;

100 -120 200 210 25 -35 400 100 20 -200

;

run;

data want;

     set have;

     if balance <0 then

           last_overdraft=lag(balance);

run;

I guess the take-home message is that Being extremely cautious when using lag() under condition, use it only if you know what you are expecting.

My2cents,

Haikuo

Trusted Advisor
Posts: 1,231

Re: Must I reset the value of lag() when using BY variables?

Posted in reply to dan_waldo

@Patrick - Thanks so much for your input on lag function. This is very helpful!

Occasional Contributor
Posts: 10

Re: Must I reset the value of lag() when using BY variables?

Posted in reply to dan_waldo

Thanks to everybody who replied -- this is very useful information.

Community Manager
Posts: 2,956

Re: Must I reset the value of lag() when using BY variables?

Posted in reply to dan_waldo

And just to pile on with more LAG anecdotes and literature, I offer this blog post:

Pitfalls of the LAG function - The SAS Dummy

Echoing the other responses: if thinking through LAG makes your head hurt (as it does mine) and you can avoid using it...do.  FIRST-dot and LAST-dot are powerful DATA step features that can solve many scenarios with elegance.

Chris

Trusted Advisor
Posts: 3,215

Re: Must I reset the value of lag() when using BY variables?

Posted in reply to dan_waldo

No need for headaches for using lag it is more the common queue function  http://en.wikipedia.org/wiki/Queue_(abstract_data_type)

With the implementation of that you see some problems as of the pull/push and sizing. SAS development could have done this in a better way.   

---->-- ja karman --<-----
🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 1246 views
  • 9 likes
  • 8 in conversation