BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AllSoEasy
Obsidian | Level 7

Hi,

I am trying to calculate forecasts of a variable, I have a value for the variable in the first observation, and then the rest of the rows of data having a missing value in this column - because that is the value i am trying to calculate within my data step. I use the lag function to get previous value when I encounter my first missing value that i need to calculate, it works for the first iteration, and then my lags all return missing value and I can't calculate more than one row of data, and I do not understand why the lag function is returning a missing value when there is clearly data present in the previous row.

I.E:

original data

Flat_rate

0.2434583666

.

.

.

.

.

.

.

goes to:

Flat_rate

0.2434583666

0.2408806974

.

.

.

.

.

.

and the remaining rows still have missing values, the code I am using:

data forecasts;

set mydata;

lag_flat_rate = lag(Flat_Rate);

  lag_d_flat_rate = lag(d_Flat_Rate);

put d_Flat_Rate= lag_flat_rate= lag_d_flat_rate= Flat_Rate= _N_=;

  if missing(Flat_rate) then do;

d_Flat_Rate=lag_d_flat_rate + /*some arithmetic*/ ;

Flat_Rate = d_Flat_Rate + lag_flat_rate;

end;

run;

Some output from the log:

d_Flat_Rate=-0.031821212 lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=0.2434583666 _N_=1

d_Flat_Rate=. lag_flat_rate=0.2434583666 lag_d_flat_rate=-0.031821212 Flat_Rate=. _N_=2

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=3

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=4

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=5

d_Flat_Rate=. lag_flat_rate=. lag_d_flat_rate=. Flat_Rate=. _N_=6

Any ideas as to why I cannot calculate as I go? How else could I possibly use data that I've calculated in a previous observation for the current calculation?

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

All good comments about a tricky subject.  Here's an even easier way:

data forecasts;

   lag_flat_rate = flat_rate;

   lag_d_flat_rate = d_flat_rate;

   set mydata;

   * calculations and PUT statements, but no LAG functions;

run;

As long as you're using SET, not INPUT, all variables coming from the source data set are automatically retained.


View solution in original post

7 REPLIES 7
Reeza
Super User

typo...

d_Flat_Rate=lag_d_flate_rate + /*some arithmetic*/ ;

                    lag_d_flat_rate

AllSoEasy
Obsidian | Level 7

Sorry about that, the typo was in my post, not in the code. That isn't the issue

yoanbolduc
Calcite | Level 5

Hi AllSoEasy,

     I might be wrong but my understanding of how lag works is that the value for the next iteration is established when you call the lag function in the current iteration. So what is means in practice is that when you do "lag_flat_rate = lag(Flat_Rate);" at your second observation, the current value of "Flat_Rate" is put in the buffer, that is missing. So even if the value of "Flat_Rate" is changed later in the current iteration, that buffer used by the lag function is not updated. Adding a second "lag_flat_rate = lag(Flat_Rate);" after the value has been updated will give you the result you expect.

I hope this helps you.

Yoan

AllSoEasy
Obsidian | Level 7

yoanbolduc- Thank you for the suggestion, so you mean such as:

data forecasts;

set mydata;

lag_flat_rate = lag(Flat_Rate);

lag_d_flat_rate = lag(d_Flat_Rate);

put d_Flat_Rate= lag_flat_rate= lag_d_flat_rate= Flat_Rate= _N_=;

  if missing(Flat_rate) then do;

d_Flat_Rate=lag_d_flat_rate + /*some arithmetic*/ ;

Flat_Rate = d_Flat_Rate + lag_flat_rate;

lag_flat_rate = lag(Flat_Rate);

lag_d_flat_rate = lag(d_Flat_Rate);

end;

??

-That produced the same result as before

run;

AllSoEasy
Obsidian | Level 7

Thank you yoanbolduc, another helpful suggestion.

But many thanks to Astounding!! Once again you have provided me with a very elegant solution that 100% resolves my issue. Can't tell you how much I appreciate it!

Thanks,

Ryan

Astounding
PROC Star

All good comments about a tricky subject.  Here's an even easier way:

data forecasts;

   lag_flat_rate = flat_rate;

   lag_d_flat_rate = d_flat_rate;

   set mydata;

   * calculations and PUT statements, but no LAG functions;

run;

As long as you're using SET, not INPUT, all variables coming from the source data set are automatically retained.


yoanbolduc
Calcite | Level 5

Hi,

     You did it right, but apparently this approach doesn't work in all cases. The test I had done was simpler. I just looked for a different approach but Astounding did beat me with a better solution for your case.

     In the case you needed to use the input statement instead of set, here is what I came up with (which is not as elegant as the solution proposed by Astounding).

data work.toto;

  input value;

  retain lag_value;

  if missing(value) then do;

  value = lag_value+1;

  end;

  lag_value = value;

datalines;;;

0

2

4

6

8

.

.

;;;

run;

So basically I reproduced the functionnality of the lag function using a retain statement. This one does work with input but is not very elegant.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1125 views
  • 4 likes
  • 4 in conversation