Hi,
I have an extract like this with multiple bans and records. It is sorted by ban and ent_seq_no. What i want to do is only keep records for a ban after the DF step only (including DF). So in this case the first 3 records would be deleted...
I am thinking that deleting the records where ent_seq_no is less than the seq no for the record that has code DF is the way but i am not sure how to write it...
BAN | ENT_SEQ_NO | COL_ACTV_CODE | COL_ACTV_DATE |
10987 | 941020677 | D2 | 04JUL2019:00:00:00 |
10987 | 941953583 | D3 | 10JUL2019:00:00:00 |
10987 | 942384352 | 3S | 12JUL2019:00:00:00 |
10987 | 942902222 | DF | 16JUL2019:00:00:00 |
10987 | 943965758 | E | 23JUL2019:00:00:00 |
10987 | 944352184 | 1S | 25JUL2019:00:00:00 |
Retain a flag variable:
data want;
set have;
by ban;
retain keepflag;
if first.ban then keepflag = 0;
if col_actv_code = 'DF' then keepflag = 1;
if keepflag;
drop keepflag;
run;
Retain a flag variable:
data want;
set have;
by ban;
retain keepflag;
if first.ban then keepflag = 0;
if col_actv_code = 'DF' then keepflag = 1;
if keepflag;
drop keepflag;
run;
It worked! I was trying this to no avail
DATA TEST1;
SET TEST;
count +1;
do until (first.ban);
BY BAN;
IF COL_ACTV_CODE = 'DF' THEN COUNT= 0;
end;
RUN;
This may get you started.
data want; set have; by ban ent_seq_no; retain flag; if first.ban then flag=0; if flag=0 and col_actv_code='DF' then flag=1; /* this keeps records after the flag is set*/ if flag=1; run;
You may want to drop the Flag variable after verifying that this works.
The BY sets special variable First. and Last. that you can check to do something for the group, in this case reset a flag. The flag should only be reset to 1 when 1) it is currently zero and 2) the value of DF is encountered. The Flag will stay at 1 until the next BAN group is encountered.
The last IF indicates which values get written to the output set.
thank you so much. your response is the same as above. The explanation helps. I think i have a much better understanding of how to work problems like this now.
data have;
input BAN ENT_SEQ_NO COL_ACTV_CODE $ COL_ACTV_DATE :datetime20.;
format COL_ACTV_DATE datetime20.;
cards;
10987 941020677 D2 04JUL2019:00:00:00
10987 941953583 D3 10JUL2019:00:00:00
10987 942384352 3S 12JUL2019:00:00:00
10987 942902222 DF 16JUL2019:00:00:00
10987 943965758 E 23JUL2019:00:00:00
10987 944352184 1S 25JUL2019:00:00:00
;
proc sql;
create table want as
select a.*
from have a ,(select ban,min(COL_ACTV_DATE) as d from have where COL_ACTV_CODE='DF' group by ban) b
where a.ban=b.ban and a.COL_ACTV_DATE>=d;
quit;
this is clever! I ran this and found that my data has duplications for DF. But I would want to keep the DF step with the latest sequence number. The other thing is that i got some duplicate outputs in my result for some reason. I will go with the data step solution but thank you for this. i have never thought about joining a table back on itself...
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.