BookmarkSubscribeRSS Feed
maozsoy
Fluorite | Level 6

Hello,

 

I am using SAS EG 7.1. I have data that named 'work.have' and it contains 'var' feature for three id. I want to calculate difference of vars for every id. To do this, i used lag() function on 3 different types but its gave different results. (the right of its; "DIFF_1")

 

Actually i expect same result from all of types. How happened different result, can you explain this?

 

Best Regards,

Thank you

 

data work.have;
input subs_id subs_var;
cards;
111111  70
111111  50
111111  100
222222  80
222222  90
333333  60
333333  90
333333  130
333333  80
;
run;

proc sort data = work.have; by subs_id; run;

data work.want;
set  work.have;

/*	way - 1	*/
lag_id = lag(subs_id);
lag_var = lag(subs_var);
if lag_id = subs_id then DIFF_1 = lag_var - subs_var; else DIFF_1 = .;

/*	way - 2	*/
if lag(subs_id) = subs_id then DIFF_2 = lag(subs_var) - subs_var; else DIFF_2 = .;

/*	way - 3	*/
by subs_id;
if not first.subs_id then DIFF_3 = lag(subs_var) - subs_var; else DIFF_3 = .;

run;

 

lag.PNG

4 REPLIES 4
ed_sas_member
Meteorite | Level 14

Hi @maozsoy 

 

Here is an explanation from the documentation about the LAG () function:

Note: Storing values at the bottom of the queue and returning values from the top of the queue occurs only when the function is executed. An occurrence of the LAGn function that is executed conditionally stores and return values only from the observations for which the condition is satisfied.

So, the best practice is to create a lagged value in an assignment statement before using it in a conditional statement. Otherwise, it can lead to unexpected results.

 

-> So WAY 1 would be the proper approach in your case:

data work.want;
	set work.have;

	/*	way - 1	*/
	lag_id=lag(subs_id);
	lag_var=lag(subs_var);

	if lag_id=subs_id then DIFF_1=lag_var - subs_var;
	else DIFF_1=.;
run;

Best,

 

maozsoy
Fluorite | Level 6

@ed_sas_member thank you for your reply. OK i will prefer way that your mentioned but i didn't understand working logic. 

 

For example 7th row, "if statement" is true (subs_id = 333333 and 6th row upper subs_id = 333333) but lag function doesn't work right and i guess it stores 5th instead of 6th row

ed_sas_member
Meteorite | Level 14

Hi @maozsoy 

 

You're welcome.

Regarding the 7th row, I think you're right because the 6th row corresponds to a "first. " so the statement is not true.

Tom
Super User Tom
Super User

@maozsoy wrote:

@ed_sas_member thank you for your reply. OK i will prefer way that your mentioned but i didn't understand working logic. 

 

For example 7th row, "if statement" is true (subs_id = 333333 and 6th row upper subs_id = 333333) but lag function doesn't work right and i guess it stores 5th instead of 6th row


Because it retrieved the value stored on the 5th observation.  The LAG() function call on the 7th observation returns the value from the 5th observation because that last time the LAG() function ran was on the 5th observation. On the 6th observation you never ran the LAG() function so there was no way for the value to get stored.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 603 views
  • 0 likes
  • 3 in conversation