BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bruinsteph
Calcite | Level 5

I am wondering why these two statements give different results. The difference is where I set the If condition:

 

data dt;
set dt;
lyear = lag(year);
lfirm = lag(firm);
if lfirm=firm;
run;

 

data dt;
set dt;
lyear = lag(year);
if lfirm=firm then lfirm = lag(firm);
run;

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

The IF statement without THEN (IF <condition>) is equivalent to IF NOT <condition> THEN DELETE;

 

So, your first data step could also be written as

data dt;
  set dt;
  lyear = lag(year);
  lfirm = lag(firm);
  if lfirm ne firm then delete;
run;

The second data step is (as others have also remarked) using a very dangerous approach: calling a LAG function conditionally.

 

Here's what the LAG function does:

It has an internal queue. Not a stack as it was called by @ChrisNZ , values in a queue are FIFO (first in, first out), while values in a stack are LIFO (last in, first out). This makes no difference when using the basic LAG function (the queue has a length of one value only), but when using e.g. LAG2 or LAG3, the queue is longer. Sorry for being a bit pedantic...

 

Stuff only gets pushed into the queue when the LAG function is called.

 

So what happens in your second data step? The statement

if lfirm=firm then lfirm = lag(firm);

does not do anything meaningful in this context. LFIRM has not been set to a value, so it is missing to begin with. The LAG function will only be called when FIRM is also missing, meaning that all the LAG function does is pull the old missing value from the queue, and push a new missing value (the current value of FIRM) into the queue. This means that LFIRM will always be missing.

 

 

View solution in original post

4 REPLIES 4
SASKiwi
PROC Star

Never use the LAG function inside a condition - you will get unpredictable results. 

ChrisNZ
Tourmaline | Level 20

The lag function manages its stack when it's called.

If you don't call it for every observation (for example you only call it when a condition is true), you don't get every value.

This is might a desirable outcome, but most often it is not.

s_lassen
Meteorite | Level 14

The IF statement without THEN (IF <condition>) is equivalent to IF NOT <condition> THEN DELETE;

 

So, your first data step could also be written as

data dt;
  set dt;
  lyear = lag(year);
  lfirm = lag(firm);
  if lfirm ne firm then delete;
run;

The second data step is (as others have also remarked) using a very dangerous approach: calling a LAG function conditionally.

 

Here's what the LAG function does:

It has an internal queue. Not a stack as it was called by @ChrisNZ , values in a queue are FIFO (first in, first out), while values in a stack are LIFO (last in, first out). This makes no difference when using the basic LAG function (the queue has a length of one value only), but when using e.g. LAG2 or LAG3, the queue is longer. Sorry for being a bit pedantic...

 

Stuff only gets pushed into the queue when the LAG function is called.

 

So what happens in your second data step? The statement

if lfirm=firm then lfirm = lag(firm);

does not do anything meaningful in this context. LFIRM has not been set to a value, so it is missing to begin with. The LAG function will only be called when FIRM is also missing, meaning that all the LAG function does is pull the old missing value from the queue, and push a new missing value (the current value of FIRM) into the queue. This means that LFIRM will always be missing.

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1189 views
  • 3 likes
  • 4 in conversation