DATA Step, Macro, Functions and more

lag function issue

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 105
Accepted Solution

lag function issue

Hi,

I am trying to calculate forecasts of a variable, I have a value for the variable in the first observation, and then the rest of the rows of data having a missing value in this column - because that is the value i am trying to calculate within my data step. I use the lag function to get previous value when I encounter my first missing value that i need to calculate, it works for the first iteration, and then my lags all return missing value and I can't calculate more than one row of data, and I do not understand why the lag function is returning a missing value when there is clearly data present in the previous row.

I.E:

original data

Flat_rate

0.2434583666

.

.

.

.

.

.

.

goes to:

Flat_rate

0.2434583666

0.2408806974

.

.

.

.

.

.

and the remaining rows still have missing values, the code I am using:

data forecasts;

set mydata;

lag_flat_rate = lag(Flat_Rate);

  lag_d_flat_rate = lag(d_Flat_Rate);

put d_Flat_Rate= lag_flat_rate= lag_d_flat_rate= Flat_Rate= _N_=;

  if missing(Flat_rate) then do;

d_Flat_Rate=lag_d_flat_rate + /*some arithmetic*/ ;

Flat_Rate = d_Flat_Rate + lag_flat_rate;

end;

run;

Some output from the log:

d_Flat_Rate=-0.031821212 lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=0.2434583666 _N_=1

d_Flat_Rate=. lag_flat_rate=0.2434583666 lag_d_flat_rate=-0.031821212 Flat_Rate=. _N_=2

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=3

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=4

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=5

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=6

Any ideas as to why I cannot calculate as I go? How else could I possibly use data that I've calculated in a previous observation for the current calculation?

Thanks!


Accepted Solutions
Solution
‎06-12-2013 04:16 PM
Super User
Posts: 5,081

Re: lag function issue

All good comments about a tricky subject.  Here's an even easier way:

data forecasts;

   lag_flat_rate = flat_rate;

   lag_d_flat_rate = d_flat_rate;

   set mydata;

   * calculations and PUT statements, but no LAG functions;

run;

As long as you're using SET, not INPUT, all variables coming from the source data set are automatically retained.


View solution in original post


All Replies
Super User
Posts: 17,819

Re: lag function issue

typo...

d_Flat_Rate=lag_d_flate_rate + /*some arithmetic*/ ;

                    lag_d_flat_rate

Frequent Contributor
Posts: 105

Re: lag function issue

Sorry about that, the typo was in my post, not in the code. That isn't the issue

Occasional Contributor
Posts: 11

Re: lag function issue

Hi AllSoEasy,

     I might be wrong but my understanding of how lag works is that the value for the next iteration is established when you call the lag function in the current iteration. So what is means in practice is that when you do "lag_flat_rate = lag(Flat_Rate);" at your second observation, the current value of "Flat_Rate" is put in the buffer, that is missing. So even if the value of "Flat_Rate" is changed later in the current iteration, that buffer used by the lag function is not updated. Adding a second "lag_flat_rate = lag(Flat_Rate);" after the value has been updated will give you the result you expect.

I hope this helps you.

Yoan

Frequent Contributor
Posts: 105

Re: lag function issue

yoanbolduc- Thank you for the suggestion, so you mean such as:

data forecasts;

set mydata;

lag_flat_rate = lag(Flat_Rate);

lag_d_flat_rate = lag(d_Flat_Rate);

put d_Flat_Rate= lag_flat_rate= lag_d_flat_rate= Flat_Rate= _N_=;

  if missing(Flat_rate) then do;

d_Flat_Rate=lag_d_flat_rate + /*some arithmetic*/ ;

Flat_Rate = d_Flat_Rate + lag_flat_rate;

lag_flat_rate = lag(Flat_Rate);

lag_d_flat_rate = lag(d_Flat_Rate);

end;

??

-That produced the same result as before

run;

Frequent Contributor
Posts: 105

Re: lag function issue

Thank you yoanbolduc, another helpful suggestion.

But many thanks to Astounding!! Once again you have provided me with a very elegant solution that 100% resolves my issue. Can't tell you how much I appreciate it!

Thanks,

Ryan

Solution
‎06-12-2013 04:16 PM
Super User
Posts: 5,081

Re: lag function issue

All good comments about a tricky subject.  Here's an even easier way:

data forecasts;

   lag_flat_rate = flat_rate;

   lag_d_flat_rate = d_flat_rate;

   set mydata;

   * calculations and PUT statements, but no LAG functions;

run;

As long as you're using SET, not INPUT, all variables coming from the source data set are automatically retained.


Occasional Contributor
Posts: 11

Re: lag function issue

Hi,

     You did it right, but apparently this approach doesn't work in all cases. The test I had done was simpler. I just looked for a different approach but Astounding did beat me with a better solution for your case.

     In the case you needed to use the input statement instead of set, here is what I came up with (which is not as elegant as the solution proposed by Astounding).

data work.toto;

  input value;

  retain lag_value;

  if missing(value) then do;

  value = lag_value+1;

  end;

  lag_value = value;

datalines;;;

0

2

4

6

8

.

.

;;;

run;

So basically I reproduced the functionnality of the lag function using a retain statement. This one does work with input but is not very elegant.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 254 views
  • 4 likes
  • 4 in conversation