In the code below, it seems to me that the first argument in the second IF statement (if valid_ct ge 1) will never be true and so it will never be processed. And yet it seems that it is being processed. Can someone explain why?
DATA c (drop=encdate dob);
    set b;
    by edipn encdate2;
    retain valid_ct;
    Days = encdate2 - dob2;
    dose+1;
    if first.edipn then do;
        dose=1;
        valid_ct=0;
    end;
    valid=0;
    if valid_ct ge 1 and (encdate2 - lag1(encdate2)) ge 24 then do;
        valid=1;
        valid_ct = valid_ct+1;
    end;
    else if (dose=1 or valid_ct=0) and (encdate2 - dob2) ge 360 then do;
        valid=1;
        valid_ct = valid_ct+1;
    end;
run;
What leads you to believe the test will never be true?
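The variable valid_ct is RETAINED across iterations of the data step, so it keeps its value from one record to the next within an EDIPN. The first record of each EDIPN sets it to 0. Once the second comparison (the ELSE IF) is true, valid_ct becomes 1, and from then on the first test can be true and push valid_ct to 2 or higher.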
If an EDIPN isn't repeated at least twice, valid_ct can't exceed 1.
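To see the first test become true, here is a small trace on made-up data. The data values and the names b_demo, trace and prev_date are hypothetical, not from the original post. It follows the posted logic, except that LAG1 is called in its own statement so that it runs on every record.

```sas
/* Hypothetical data: one edipn, three encounters */
data b_demo;
    input edipn dob2 :date9. encdate2 :date9.;
    format dob2 encdate2 date9.;
    datalines;
1 01JAN2019 15JAN2020
1 01JAN2019 20FEB2020
1 01JAN2019 25FEB2020
;
run;

data trace;
    set b_demo;
    by edipn encdate2;
    retain valid_ct;
    prev_date = lag1(encdate2);   /* called on every record, outside the IF */
    dose + 1;
    if first.edipn then do;
        dose = 1;
        valid_ct = 0;
    end;
    valid = 0;
    if valid_ct ge 1 and (encdate2 - prev_date) ge 24 then do;
        valid = 1;
        valid_ct = valid_ct + 1;
    end;
    else if (dose = 1 or valid_ct = 0) and (encdate2 - dob2) ge 360 then do;
        valid = 1;
        valid_ct = valid_ct + 1;
    end;
    put _n_= dose= valid= valid_ct=;   /* write the running values to the log */
run;
```

On this data the log should show valid_ct going 1, 2, 2. The ELSE IF fires on the first record (379 days since dob2), the first test fires on the second record (valid_ct is already 1 and the gap is 36 days), and neither fires on the third (the gap is only 5 days).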
Thanks for the reply!
I guess I don't understand what's going on with the RETAIN and first.edipn.
I was thinking that if first.edipn is true then valid_ct gets set to 0 else it is empty. What does valid_ct get set to when first.edipn is false?
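RETAIN keeps a variable's value from one iteration of the data step to the next, instead of resetting it to missing when the next record is read. Note: if you RETAIN a variable that already exists in the input data set, you'll likely not see the results you expect, because the SET statement overwrites it on every record.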
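A minimal sketch of the difference, using a made-up data set (demo, retain_demo, kept and not_kept are hypothetical names used only for illustration):

```sas
/* Hypothetical sample data, already sorted by id */
data demo;
    input id $ x;
    datalines;
A 1
A 2
A 3
B 4
B 5
;
run;

data retain_demo;
    set demo;
    by id;
    retain kept;                /* kept survives from one iteration to the next */
    if first.id then do;
        kept = 0;
        not_kept = 0;           /* not retained: reset to missing at the top of each iteration */
    end;
    kept = kept + 1;            /* 1, 2, 3 for A, then 1, 2 for B */
    not_kept = not_kept + 1;    /* 1 on the first row of each id, missing on the others */
run;

proc print data=retain_demo;
run;
```

This is also the answer to "what is valid_ct when first.edipn is false": with RETAIN it is whatever it was at the end of the previous record, and without RETAIN it would be missing.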
FIRST. and LAST. are special automatic variables that SAS creates for each variable on the BY statement. If the data isn't sorted by those variables you'll get an error unless you use NOTSORTED (be very careful with this). "If first.edipn then do;" tells SAS to run the statements that follow, up to the matching END, only on the first record for each edipn value. Here the DO ... END block assigns initial values to two variables (dose and valid_ct).
If the record is not the first one for that edipn, the block is skipped and the program goes on to the next statement, so valid_ct keeps whatever value it had at the end of the previous record. You can use FIRST. and LAST. for any of the variables on the BY statement.
If you have something you want done only on the last record for an edipn you could use: if last.edipn then do; <something>; end; If there is only one record for a value of edipn then it is both first and last.
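Here is a short sketch of FIRST. and LAST. on made-up data (visits, flags, is_first, is_last and n are hypothetical names). The flags are copied into ordinary variables because the FIRST. and LAST. variables themselves are not written to the output data set.

```sas
/* Hypothetical data, already sorted by id; id B has a single record */
data visits;
    input id $ visit;
    datalines;
A 1
A 2
B 1
C 1
C 2
C 3
;
run;

data flags;
    set visits;
    by id;
    is_first = first.id;          /* 1 on the first record of each id, else 0 */
    is_last  = last.id;           /* 1 on the last record of each id, else 0 */
    if first.id then n = 0;       /* reset the counter at the start of each id */
    n + 1;                        /* count records within the id */
    if last.id then put id= n=;   /* one log line per id with its record count */
run;
```

For id B both is_first and is_last are 1, since its only record is both first and last.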
Note that on each first.edipn record, LAG1(encdate2) will be pointing to a value from the previous edipn, except on the very first record, where it is missing. So you might look into that bit of logic (perhaps adding "and not first.edipn" to the condition).
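One possible way to handle this, as a sketch rather than a drop-in fix: LAG returns the value stored the last time the function executed, so a common precaution is to call it in its own assignment on every record and then blank the result at the start of each BY group. The data set and the names b_demo2, gaps, prev_date, gap and gap_ok are hypothetical.

```sas
/* Hypothetical data: two edipn values, two encounters each */
data b_demo2;
    input edipn encdate2 :date9.;
    format encdate2 date9.;
    datalines;
1 01JAN2020
1 01FEB2020
2 15JAN2020
2 20JAN2020
;
run;

data gaps;
    set b_demo2;
    by edipn encdate2;
    prev_date = lag1(encdate2);          /* runs on every record, so the queue stays in step */
    if first.edipn then prev_date = .;   /* do not carry a date over from the previous edipn */
    gap = encdate2 - prev_date;          /* missing on the first record of each edipn */
    gap_ok = (gap ge 24);                /* a missing gap compares as false, so gap_ok is 0 there */
    format prev_date date9.;
run;
```

With this layout the gap is only ever measured between two encounters of the same edipn, and the test on gap can then be used in the IF condition in place of the inline LAG1 call.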
The DOSE + 1; line is a sum statement, which has an implied RETAIN, so that it counts each record within the edipn.
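To illustrate the implied RETAIN, the sketch below computes the same counter twice on made-up data: once with the sum statement and once with an explicit RETAIN and the SUM function. The names visits2, counts and dose2 are hypothetical.

```sas
/* Hypothetical data, already sorted by id */
data visits2;
    input id $ visit;
    datalines;
A 1
A 2
B 1
C 1
C 2
C 3
;
run;

data counts;
    set visits2;
    by id;

    dose + 1;                 /* sum statement: dose starts at 0 and is retained */

    retain dose2 0;           /* the same thing written out explicitly */
    dose2 = sum(dose2, 1);

    if first.id then do;      /* restart both counters at the start of each id */
        dose = 1;
        dose2 = 1;
    end;
run;
```

Both dose and dose2 should come out as 1, 2, 1, 1, 2, 3 on this data.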