Hi @Sanflo Well done, The change you did was neat and correct.
Basically, The code is pretty straight forward as your sample is sorted by ascending date, uses two passes(executing a by group twice hence you see do until*twice).
On the first pass, we determine the first date for A and the first date for D. We hold these two dates in temporary variables that we would use for conditional check and output on the 2nd pass of the same by group and eventually drop them at the end.
The N(of _t: ) determines the number of non missing values as if one were to be missing, we are either missing A or D. Since we need both, it has be 2, so that's a check in place.
Then it's pretty much the simple between-and filter that does our need.
Now for your requirement's minor change(up-to to D but excluding D), the code further simplifies to:
data want;
do until(last.uid);
set have;
by uid STATUS notsorted;
if status='A' and not _t1 then _t1=date;
else if not _t2 and status='D' then _t2=date;
end;
do until(last.uid);
set have;
by uid STATUS notsorted;
if n(of _t:)=2 and _t1<=date<_t2 then output;
end;
drop _:;
run;
for the reason that we are not concerned about the D date anymore which may have another status like A falling on the same date. Nonetheless, please do test thoroughly and feel free to come back to us if you need more help.
HTH & Regards!
@Sanflo wrote:
Hi @novinosrin,
First, thanks for the kind words and answers to your questions from prior:
1. Yes there can be multiples of D for a by group. We want to stop when it finds the first instance of this (at least just now).
2. It would be the earliest A to the earliest D.
Next thank you so much for your solution provided. It worked perfectly in with both the sample data and also when I modified it to my live data. If you don't mind may I ask you another question?
- I realised once I ran it that my original requirement that keeping all records up to and including "D" isn't quite correct. I should find all records up to "D" but then not including "D" of the by group. I updated the below line of your code to change the 2nd "<=" to a "<". After testing this everything looks perfect in the data. Would you be able to confirm that this would be correct and perhaps explain how the code works a little?
Again much appreciated for the code and your time!
Sandy
IF N(OF _T:)=2 AND _T1<=DATEUPDATED<_T2 and MISSING(_S) THEN OUTPUT;
... View more