turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- lag function issue

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

06-12-2013 03:51 PM

Hi,

I am trying to calculate forecasts of a variable, I have a value for the variable in the first observation, and then the rest of the rows of data having a missing value in this column - because that is the value i am trying to calculate within my data step. I use the lag function to get previous value when I encounter my first missing value that i need to calculate, it works for the first iteration, and then my lags all return missing value and I can't calculate more than one row of data, and I do not understand why the lag function is returning a missing value when there is clearly data present in the previous row.

I.E:

original data

Flat_rate

0.2434583666

.

.

.

.

.

.

.

goes to:

Flat_rate

0.2434583666

0.2408806974

.

.

.

.

.

.

and the remaining rows still have missing values, the code I am using:

data forecasts;

set mydata;

lag_flat_rate = lag(Flat_Rate);

lag_d_flat_rate = lag(d_Flat_Rate);

put d_Flat_Rate= lag_flat_rate= lag_d_flat_rate= Flat_Rate= _N_=;

if missing(Flat_rate) then do;

d_Flat_Rate=lag_d_flat_rate + /*some arithmetic*/ ;

Flat_Rate = d_Flat_Rate + lag_flat_rate;

end;

run;

Some output from the log:

d_Flat_Rate=-0.031821212 lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=0.2434583666 _N_=1

d_Flat_Rate=. lag_flat_rate=0.2434583666 lag_d_flat_rate=-0.031821212 Flat_Rate=. _N_=2

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=3

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=4

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=5

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=6

Any ideas as to why I cannot calculate as I go? How else could I possibly use data that I've calculated in a previous observation for the current calculation?

Thanks!

Accepted Solutions

Solution

06-12-2013
04:16 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to AllSoEasy

06-12-2013 04:16 PM

All good comments about a tricky subject. Here's an even easier way:

data forecasts;

lag_flat_rate = flat_rate;

lag_d_flat_rate = d_flat_rate;

set mydata;

* calculations and PUT statements, but no LAG functions;

run;

As long as you're using SET, not INPUT, all variables coming from the source data set are automatically retained.

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to AllSoEasy

06-12-2013 03:59 PM

typo...

d_Flat_Rate=lag_d_flate_rate + /*some arithmetic*/ ;

lag_d_flat_rate

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

06-12-2013 04:02 PM

Sorry about that, the typo was in my post, not in the code. That isn't the issue

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to AllSoEasy

06-12-2013 04:02 PM

Hi AllSoEasy,

I might be wrong but my understanding of how lag works is that the value for the next iteration is established when you call the lag function in the current iteration. So what is means in practice is that when you do "lag_flat_rate = lag(Flat_Rate);" at your second observation, the current value of "Flat_Rate" is put in the buffer, that is missing. So even if the value of "Flat_Rate" is changed later in the current iteration, that buffer used by the lag function is not updated. Adding a second "lag_flat_rate = lag(Flat_Rate);" after the value has been updated will give you the result you expect.

I hope this helps you.

Yoan

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to yoanbolduc

06-12-2013 04:08 PM

**yoanbolduc- **Thank you for the suggestion, so you mean such as:

data forecasts;

set mydata;

lag_flat_rate = lag(Flat_Rate);

lag_d_flat_rate = lag(d_Flat_Rate);

put d_Flat_Rate= lag_flat_rate= lag_d_flat_rate= Flat_Rate= _N_=;

if missing(Flat_rate) then do;

d_Flat_Rate=lag_d_flat_rate + /*some arithmetic*/ ;

Flat_Rate = d_Flat_Rate + lag_flat_rate;

**lag_flat_rate = lag(Flat_Rate);**

**lag_d_flat_rate = lag(d_Flat_Rate);**

end;

??

-That produced the same result as before

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to AllSoEasy

06-12-2013 04:51 PM

Thank you yoanbolduc, another helpful suggestion.

But many thanks to Astounding!! Once again you have provided me with a very elegant solution that 100% resolves my issue. Can't tell you how much I appreciate it!

Thanks,

Ryan

Solution

06-12-2013
04:16 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to AllSoEasy

06-12-2013 04:16 PM

All good comments about a tricky subject. Here's an even easier way:

data forecasts;

lag_flat_rate = flat_rate;

lag_d_flat_rate = d_flat_rate;

set mydata;

* calculations and PUT statements, but no LAG functions;

run;

As long as you're using SET, not INPUT, all variables coming from the source data set are automatically retained.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

06-12-2013 04:25 PM

Hi,

You did it right, but apparently this approach doesn't work in all cases. The test I had done was simpler. I just looked for a different approach but Astounding did beat me with a better solution for your case.

In the case you needed to use the input statement instead of set, here is what I came up with (which is not as elegant as the solution proposed by Astounding).

data work.toto;

input value;

retain lag_value;

if missing(value) then do;

value = lag_value+1;

end;

lag_value = value;

datalines;;;

0

2

4

6

8

.

.

;;;

run;

So basically I reproduced the functionnality of the lag function using a retain statement. This one does work with input but is not very elegant.