Hello,
I am using SAS EG 7.1. I have data that named 'work.have' and it contains 'var' feature for three id. I want to calculate difference of vars for every id. To do this, i used lag() function on 3 different types but its gave different results. (the right of its; "DIFF_1")
Actually i expect same result from all of types. How happened different result, can you explain this?
Best Regards,
Thank you
data work.have;
input subs_id subs_var;
cards;
111111 70
111111 50
111111 100
222222 80
222222 90
333333 60
333333 90
333333 130
333333 80
;
run;
proc sort data = work.have; by subs_id; run;
data work.want;
set work.have;
/* way - 1 */
lag_id = lag(subs_id);
lag_var = lag(subs_var);
if lag_id = subs_id then DIFF_1 = lag_var - subs_var; else DIFF_1 = .;
/* way - 2 */
if lag(subs_id) = subs_id then DIFF_2 = lag(subs_var) - subs_var; else DIFF_2 = .;
/* way - 3 */
by subs_id;
if not first.subs_id then DIFF_3 = lag(subs_var) - subs_var; else DIFF_3 = .;
run;
Hi @maozsoy
Here is an explanation from the documentation about the LAG () function:
Note: Storing values at the bottom of the queue and returning values from the top of the queue occurs only when the function is executed. An occurrence of the LAGn function that is executed conditionally stores and return values only from the observations for which the condition is satisfied.
So, the best practice is to create a lagged value in an assignment statement before using it in a conditional statement. Otherwise, it can lead to unexpected results.
-> So WAY 1 would be the proper approach in your case:
data work.want;
set work.have;
/* way - 1 */
lag_id=lag(subs_id);
lag_var=lag(subs_var);
if lag_id=subs_id then DIFF_1=lag_var - subs_var;
else DIFF_1=.;
run;
Best,
@ed_sas_member thank you for your reply. OK i will prefer way that your mentioned but i didn't understand working logic.
For example 7th row, "if statement" is true (subs_id = 333333 and 6th row upper subs_id = 333333) but lag function doesn't work right and i guess it stores 5th instead of 6th row
Hi @maozsoy
You're welcome.
Regarding the 7th row, I think you're right because the 6th row corresponds to a "first. " so the statement is not true.
@maozsoy wrote:
@ed_sas_member thank you for your reply. OK i will prefer way that your mentioned but i didn't understand working logic.
For example 7th row, "if statement" is true (subs_id = 333333 and 6th row upper subs_id = 333333) but lag function doesn't work right and i guess it stores 5th instead of 6th row
Because it retrieved the value stored on the 5th observation. The LAG() function call on the 7th observation returns the value from the 5th observation because that last time the LAG() function ran was on the 5th observation. On the 6th observation you never ran the LAG() function so there was no way for the value to get stored.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.