Hi everybody
There are two pieces of code below that behave differently. Can anybody explain why?
Hint: the LAG function pulls from a queue!
As far as I can tell, they should produce the same result.
data test;
INFILE datalines DLM=',' DSD;
input a b c ;
datalines;
4272451,17878,17878
4272451,17878,17878
4272451,17887,17887
4272454,17878,17878
4272454,17881,17881
4272454,17893,17893
4272455,17878,17878
4272455,17878,18200
run;
DATA TEST1;
RETAIN F ( 1) ;
laga = lag(a); lagb=lag(b);
SET TEST;
IF A^=laga OR laga =. THEN do; f=1;end; ELSE IF A=laga AND b>lagb THEN do; f=f+1 ; end;
RUN;
proc print data=test1;
run;
DATA TEST2;
RETAIN F ( 1) ;
SET TEST;
IF A^=LAG(A) OR LAG(A)=. THEN do; f=1;end; ELSE IF A=LAG(A) AND b>LAG(B) THEN do; f=f+1 ; end;
RUN;
proc print data=test2;
run;
The problem is the ELSE clause. It means the LAG(A) and LAG(B) calls in the second IF statement do not run on every iteration, only when the first condition is false. So instead of comparing the current value to the immediately preceding value, you may be comparing it to a value from more than one observation earlier.
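A minimal sketch of that effect, assuming a hypothetical six-row dataset named demo (it is not part of the original post). One LAG call runs on every iteration; the other runs only when x is even, so its queue is updated only on those rows.

```sas
/* Hypothetical demo data: x = 1..6 */
data demo;
   input x;
   datalines;
1
2
3
4
5
6
;
run;

data _null_;
   set demo;
   /* This LAG executes on every iteration, so it returns the previous observation. */
   always = lag(x);
   /* This LAG executes only for even x, so it returns the previous EVEN x. */
   if mod(x, 2) = 0 then cond = lag(x);
   put x= always= cond=;
run;
```

For x=4 the log should show always=3 but cond=2, and for x=6 it should show always=5 but cond=4: the conditional call returns the value from the last time it was executed, not from the previous observation.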
Thanks Tom, you are right.
Actually, I have one more question. Why do the LAG functions in the IF part and the ELSE part act differently? They are on the same input line.
An IF-THEN-ELSE statement is evaluated from left to right and stops as soon as a condition is met, so the ELSE part is skipped whenever the IF condition is true.
In the ELSE part of the IF statement, the LAG function pulls the previous value. To determine what counts as "previous", what does it check? The row number? The observation? _N_? Which one?
Thanks
It works with a queue and returns the last value stored there. However, a value is only put into the queue when the LAG call is actually executed.
Thanks for the answer, but I still don't get it. Do the IF part and the ELSE part use the same queue or not? If they use the same queue, then the LAG function should produce the same result in both parts of the IF statement. If they don't use the same queue, then why not? Where is the queue for the ELSE part? How can I reach it?
You also have trouble with the location of your LAG function calls in the first case. You have placed them BEFORE the SET statement.
This pushes an extra set of missing values onto the queue, so you end up getting the value from two observations before the call.
data test2;
before=lag(a);
set test;
after =lag(a);
put (a before after) (=);
run;
a=1 before=. after=.
a=2 before=. after=1
a=3 before=1 after=2
a=4 before=2 after=3
a=5 before=3 after=4
a=6 before=4 after=5
a=7 before=5 after=6
This is very surprising to me! I can't understand why there is an extra missing step, but thanks, I will think about it.
Try putting PUT statements before and after the SET statement to see when the values of the variables from the input dataset change.
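A short sketch of that suggestion, assuming the TEST dataset from the first post. The 'Before SET' and 'After SET' labels are only illustrative text.

```sas
data _null_;
   /* At this point A still holds the value read on the previous iteration
      (missing on the first iteration), because variables read with SET are retained. */
   put 'Before SET: ' _n_= a=;
   set test;
   /* Only now has A been replaced by the current observation's value. */
   put 'After SET:  ' _n_= a=;
run;
```

Any LAG call placed above the SET statement sees the "Before SET" value of A, which is why it appears to lag by two observations.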
hi ... you can have a LAG function within a conditional statement that does execute in every iteration of the data step if you use an IFN function instead of your IF-THEN-ELSE statements...
data test3;
retain f 1;
set test;
f = ifn(a ne lag(a) or missing(lag(a)) , 1 , ifn(a eq lag(a) and b gt lag(b) , f+1 , f ));
run;
the above gives the same result as one of your data steps, modified slightly by moving the LAG statements after the SET statement ...
DATA TEST1;
RETAIN F ( 1) ;
SET TEST;
laga = lag(a); lagb=lag(b);
IF A^=laga OR laga =. THEN do; f=1;end; ELSE IF A=laga AND b>lagb THEN do; f=f+1 ; end;
RUN;
also fyi ... you don't need the DO-END stuff or the parentheses in the RETAIN and you can use a SUM statement ...
DATA TEST1;
RETAIN F 1 ;
SET TEST;
laga = lag(a);
lagb=lag(b);
IF A^=laga OR laga =. THEN f=1;
ELSE
IF A=laga AND b>lagb THEN f+1;
RUN;
and ... since in the first pass through the data step, laga is missing, you can skip the RETAIN statement that gives F an initial value of 1 since F will be assigned a 1 in the first pass and the sum statement (F+1) is an "implied RETAIN" for F
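For illustration, a sketch of what that version without the RETAIN statement might look like, assuming the TEST dataset from the first post:

```sas
data test1;
   set test;
   /* Unconditional LAG calls after SET: both queues are updated on every iteration. */
   laga = lag(a);
   lagb = lag(b);
   /* First pass: laga is missing, so F is set to 1 here. */
   if a ne laga or missing(laga) then f = 1;
   /* The sum statement retains F across iterations without an explicit RETAIN. */
   else if a = laga and b > lagb then f + 1;
run;
```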
ps you can read more about LAG and IF in Howard Schreier's paper ... "Conditional Lags Don't Have to be Treacherous"
http://www.howles.com/saspapers/CC33.pdf
Thanks for the answer Mike, it was helpful.
But I am still trying to find out how the LAG function determines what is "previous". If it gets the value from a queue, why do the IF part and the ELSE part use different queues?
hi ... it's not that a different part of the queue is used
LAG gives you the value of a variable from the last time the LAG function was executed, NOT the value of the variable in the previous observation (that's what Tom told you)
so ... if you use IF-THEN-ELSE without the statements ...
LAGA = LAG(A);
LAGB = LAG(B);
then LAG(A) and/or LAG(B) may not get executed during each pass through the data step
OK, I understand the "last time it was executed" part. But then I have to ask: is the last execution in the IF part different from the last execution in the ELSE part? According to Tom, the ELSE part is not executed in every iteration. So there must be a log or something similar that tells LAG where to continue from, isn't there?
Let's create a scenario: there is a dataset, an IF-THEN-ELSE statement, and a LAG function in each part:
Observation 1: the IF condition is true, so the LAG in the IF part is executed. The LAG in the ELSE part is not executed.
Observation 2: the IF condition is true, so the LAG in the IF part is executed. The LAG in the ELSE part is not executed.
Observation 3: the IF condition is false, so the LAG in the IF part is not executed. The LAG in the ELSE part is executed.
What is the value of LAG in the third observation?
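One way to check that scenario is to run it. The sketch below assumes a hypothetical four-row dataset named scenario and a threshold of 20 chosen so that the IF condition is true for the first two observations and false afterwards. It relies on the behavior described earlier in the thread: each LAG call in the code has its own queue, updated only when that call executes.

```sas
/* Hypothetical data for the scenario */
data scenario;
   input a;
   datalines;
10
20
30
40
;
run;

data _null_;
   set scenario;
   length branch $4;
   if a <= 20 then do;
      branch = 'IF';
      prev = lag(a);   /* first LAG call, with its own queue */
   end;
   else do;
      branch = 'ELSE';
      prev = lag(a);   /* second LAG call, with a separate queue */
   end;
   put a= branch= prev=;
run;
```

For the third observation (a=30) the log should show prev=. because the LAG in the ELSE part is being executed for the first time and its queue is still empty. For the fourth observation (a=40) it should show prev=30.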