## lag function

Solved
Occasional Contributor
Posts: 14

# lag function

Hi

I have a test data with missing values and I want to fill the table by taking previous month's value.

Test data

 month A B C Jan 1 2 3.3 Feb 4 . 56 Mar 3 . . Apr 5 3 4 May . 6 2

data I want to have

 month A B C Jan 1 2 3.3 Feb 4 2 56 Mar 3 2 56 Apr 5 3 4 May 5 6 2

Here's the code I wrote, but it gives me the exact original table with missing data and I do not get an error message in the log.

I wonder why the lagged values are not applied according to my if statement logic..

I appreciate if someone could help me shed a light on this.

data test;

input month \$ A B C;

datalines;

Jan 1 2 3.3

Feb 4 . 56

Mar 3 . .

Apr 5 3 4

May . 6 2

;

run;

data no_missing(drop=i);

set test;

array ticker(*)A--C;

do i=1 to dim(ticker);

if ticker{i}=. then ticker{i}=lag(ticker{i});

else ticker{i}=ticker{i};

end;

run;

Accepted Solutions
Solution
‎02-27-2015 11:08 AM
Posts: 3,852

## Re: lag function

LAG it not the proper tool for LOCF.  I like the UPDATE trick for your data.

data abc;
input month \$ A  B  C;
cards;
Jan   1  2  3.3
Feb   4  .  56
Mar   3  .  .
Apr   5  3  4
May   .  6  2
;;;;
run;
data abcV / view=abcv;
retain dummy 1;

set abc;
run;
data locf;
update abcv(obs=0) abcv;
by dummy;
output;

drop dummy;
run;
proc print;

run;

All Replies
Occasional Contributor
Posts: 14

## Re: lag function

I just recalled lag function in a conditional statement always returns missing value...

That's why I was getting the identical original table after running code..

but I appreciate if anyone can let me know how to code this without using lag function..

Solution
‎02-27-2015 11:08 AM
Posts: 3,852

## Re: lag function

LAG it not the proper tool for LOCF.  I like the UPDATE trick for your data.

data abc;
input month \$ A  B  C;
cards;
Jan   1  2  3.3
Feb   4  .  56
Mar   3  .  .
Apr   5  3  4
May   .  6  2
;;;;
run;
data abcV / view=abcv;
retain dummy 1;

set abc;
run;
data locf;
update abcv(obs=0) abcv;
by dummy;
output;

drop dummy;
run;
proc print;

run;

Occasional Contributor
Posts: 14

## Re: lag function

Hi data_null,

Thank you so much for your quick response!

I came up with my revised code, but this only works if I know how many consecutive rows will be missing in advance.

Your code is much better!! Thanks again!

data no_missing(drop=i);

set test;

array ticker(*)A--C;

do i=1 to dim(ticker);

lag=lag(ticker{i});

lag=lag2(ticker{i});

if ticker{i}=. then ticker{i}=coalesce(lag,lag2);

else if ticker{i}=ticker{i};

end;

run;

Posts: 3,167

## Re: lag function

Like the VIEW! Learn some!

I used to hate adding another dummy variable in this case, but not anymore with the view. Thanks for sharing, John.

Posts: 3,167

## Re: lag function

Quote: "I just recalled lag function in a conditional statement always returns missing value.."

Wrong. The LAG() will return next value in the queue, missing or non-missing, conditional or non-conditional. Please read the manual.

Just like you, I like the options from data _null_ very much. Here is another alternative that maybe closer to what you are familiar with:

data test;

input month \$ A B C;

datalines;

Jan 1 2 3.3

Feb 4 . 56

Mar 3 . .

Apr 5 3 4

May . 6 2

;

run;

data want;

set test;

array ticker(3) a--c;

array _ticker(3) _temporary_;

do i=1 to 3;

_ticker(i)=coalesce(ticker(i), _ticker(i));

ticker(i)=_ticker(i);

end;

drop i;

run;

Occasional Contributor
Posts: 14

## Re: lag function

Hi Hai.Kuo,

Thanks for your response and pointing this out!

In below doc, I read "Since LAG is called in conditional code, the value it returned is not the value from the previous observation, but is the value of its previous execution."

I also read "However, when it is executed conditionally, the LAG function only retrieves values from observations for which the condition is satisfied."

If you have any other good article on this topic, please let me know.

http://support.sas.com/resources/papers/proceedings09/055-2009.pdf

Posts: 3,852

## Re: lag function

I think both of those statement are saying essentially the same thing. I think it is best to leave the word observation(s) out of the conversation all together.

The LAG function retrieves the value from the last time it was executed.  Or in the case of LAGn the nth time it was executed.

Posts: 3,167

## Re: lag function

I know it is confusing, and trust me, I have been there and stuck there for a not-short amount of time. SAS could do better to help users on this matter by 1. Changing the function name from Lag() to Queue() or something.

2. Doing better explanation job in their Docs, at least for the perspective of new users.

I completely agree what Data _null_ has just said in his post, here I will give you a short example in a hope to help you understand this better:

Here is the code, very simple, only focusing on the situation of LAG1, while LAGn() is pretty much the same concept.

data lag_test;

set sashelp.class(obs=6 keep=Name);

Uncon_queue=lag(name);

if mod(_n_,2)=0 then

Con_queue=lag(name);

run;

Here is the results:

Explanation:

First, I want you to picture a queue, where LAG() Push value  IN and Pull value OUT, I don't know where it sits, but I suspect it is somewhere in the memory, and its size will be limited by that factor.

Second, remember that when LAG() executed, it will Pull out the last (most recent) in the queue , and Push in another one (the one in PDV) .(if LAG2(), it will pull out the Second Last in the queue, if LAG3(), it will pull the Third Last in the queue, so on so forth. But in term of Push-in, they all behave the same.)

For the Uncon_queue,

_n_=1, LAG() executed, since this is the first time it executed, there is no values in the queue. So Pull_out(output)=MISSING, Push_in=Alfred;

_n_=2, LAG() executed, Pull_out(output)=Alfred, Push_in=Alice

_n_=3, LAG() executed, Pull_out(output)=Alice, Push_in=Barbara

_n_=4, so on so forth.

So the content of this queue is: everyone from Alfred to James, it is consistent with a picture of 'lagging' observations.

For the Con_queue, very SAME thing happens, keep in mind LAG() only executed when _n_ is a even number.

_n_=1, LAG() NOT executed, so output=missing

_n_=2, LAG() executed, since this is the first time it executed, there is no values in the queue. So Pull_out(output)=MISSING, Push_in=Alice, the current on in PDV.

_n_=3, LAG() NOT executed, so output=missing

_n_=4, LAG() executed, Pull_out(output)=Alice, the last in Queue, Push_in=Carol, the current one in PDV

_n_=5, LAG() NOT executed, so output=missing

_n_=6, LAG() executed, Pull_out(output)=Carol, the last in Queue, Push_in=James, the current one in PDV

So the content of this queue is: Alice, Carol, James, where their _n_ is a even number, Obviously it is NOT consistent with picture of 'lagging' observations.

But if you take this "Queue" concept into play, LAG() is predictable and consistent, like I said before, whenever it executes, it dumps in the current one in the PDV, at the same time, pull the last one in the queue out, missing or non-missing, conditional or non-conditional.

In the real life, to make new user life easier, they are often taught not to use LAG() conditionally, because unconditionally using LAG() will be consistent with the layman concept of LAG, that is all of the obs will be put into the queue, one after another, nicely. And fortunately, most of the scenarios does not require to LAG under any condition.

But if you do have to use it conditionally, picture it as being pulling the last one from a queue, which is entirely different from pulling the previous observation.

HTH,

Haikuo

Occasional Contributor
Posts: 14

## Re: lag function

Hi Hai.Kuo and data_null_,

Thank you so much for your response!

I am at awe how helpful and knowledgeable you are and I love SAS community for this precise reason!!

I hope you are having great weekend and thanks again!!

🔒 This topic is solved and locked.