Obsidian | Level 7

## Use lag function without storing lag function output as a variable

Col-A Col-B

X        9

X        4

X        3

Y        .

X        1

I want to add the last values of Col-B when Col-A is equal to "Y" and replace the null value by it

Output ->

Col-A Col-B

X        9

X        4

X        3

Y        7

X        1

To achieve this output I am using this code->

data table;
set table;
if "Col-A"n = "Y"
then "Col-B"n=lag1("Col-B"n)+lag2("Col-B"n);
run;
But I am getting the following error in this code ->
NOTE: Missing values were generated as a result of performing an operation on missing values.

Note- I am getting proper output when I am storing the lag function variables first and then adding them,

But I don't want to unnecessarily store variables.

9 REPLIES 9
Tourmaline | Level 20

## Re: Use lag function without storing lag function output as a variable

Try this

``````data have;
input a \$ b;
datalines;
X 9
X 4
X 3
Y .
X 7
;

data want;
set have;
b = ifn(b = ., sum(lag1(b), lag2(b)), b);
run;``````
Obsidian | Level 7

## Re: Use lag function without storing lag function output as a variable

Your code worked perfectly but can you please explain to me what is wrong with the code that I am using?

Garnet | Level 18

## Re: Use lag function without storing lag function output as a variable

@Saurabh_Rana wrote:

Your code worked perfectly but can you please explain to me what is wrong with the code that I am using?

Suppose x value is missing and y=5.
result of x+y will be missing value, while sum(of x, y) = 5

that is because sum function ignores missing value or relate to it as zero.

Obsidian | Level 7

## Re: Use lag function without storing lag function output as a variable

Yeah but there are no null values, in this case, it's simply 3+4=7.
Tourmaline | Level 20

## Re: Use lag function without storing lag function output as a variable

Sure. The Lag Function is not a lookback function. Rather, it is a queue function. When you call Lag2(b) you set up a queue with two entries. THe values in the entries are initially missing. Each time the Lag Function is called, the top value is pushed out and returned from the queue. The remaining values are shifted upward, and the new value of the argument is placed at the bottom of the queue.

That is why you will almost certainly get undesired results when calling the Lag Function in conditional logic such as If-Then Statements. However, when we use the Ifn Function, the Lag Function is executed regardless of whether the expression in the first argument is true or not.

Hope this clears things up 🙂

Obsidian | Level 7

## Re: Use lag function without storing lag function output as a variable

Thank you for your in-depth explanation but I am still not very clear about why it doesn't work with IF-Then on the other works perfectly with Ifn function.
I guess my basic understanding of base sas is not very clear.
Tourmaline | Level 20

## Re: Use lag function without storing lag function output as a variable

The difference is:

With the If-Then Statements, the lag function executes only when the condition is true.

The Ifn Function is designed so that the lag function executes no matter if the condition is true or not.

Obsidian | Level 7

## Re: Use lag function without storing lag function output as a variable

Yes exactly, I want the lag function to be executed just for the case where Col-A is equal to "Y",

Like this -

A B lag1(B) lag2(B)
X 4
X 3
Y . 3 4
Tourmaline | Level 20

## Re: Use lag function without storing lag function output as a variable

Then this is the code for you.

Makes sense?

``````data have;
input a \$ b;
datalines;
X 9
X 4
X 3
Y .
X 7
;

data want;
set have;
b = ifn(a = "Y", sum(lag1(b), lag2(b)), b);
run;``````
Discussion stats
• 9 replies
• 793 views
• 2 likes
• 3 in conversation