Hello,
I have a dataset that looks like the one below, but I am having a hard time selecting only the data between the first B and the second B. Does anyone have an idea how to achieve that?
Original Dataset (example):
AcctNum Date Type
1 20130104 P
1 20130118 P
1 20130124 B
1 20130131 P
1 20130214 P
1 20130221 B
2 20130109 B
2 20130111 P
2 20130206 B
3 20130114 B
3 20130214 B
. . .
. . .
. . .
Desired dataset:
1 20130124 B
1 20130131 P
1 20130214 P
2 20130109 B
2 20130111 P
3 20130114 B
Thanks in advance!
Regards,
Frank
I'm not sure if there is a function for this, but you can easily create the flag variables and logic yourself, e.g.:
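data want (drop=firstone keepit);
  set example;   /* BY processing requires example to be sorted by acctnum */
  by acctnum;
  retain keepit firstone;
  /* Reset both flags at the start of each account */
  if first.acctnum then do;
    firstone=1;
    keepit=0;
  end;
  /* First B in the account: start keeping rows */
  if firstone and type eq "B" then do;
    keepit=1;
    firstone=0;
  end;
  /* Next B while keeping: stop keeping rows */
  else if keepit and type eq "B" and not firstone then keepit=0;
  if keepit then output;
run;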
Thanks, Arthur. It works perfectly. Awesome logic!
It depends on what you want to do when a group has only one 'B'. Arthur's code currently pulls such groups (it outputs everything from the single 'B' onward), so it may need some tweaks, and you will need another pass to exclude them. Here is a hash-based solution that does not pull groups with only one 'B'.
data have;
input AcctNum $ Date :yymmdd8. Type$;
format date yymmdd8.;
cards;
1 20130104 P
1 20130118 P
1 20130124 B
1 20130131 P
1 20130214 P
1 20130221 B
2 20130109 B
2 20130111 P
2 20130206 B
3 20130114 B
3 20130214 B
4 20130214 B
5 20130104 B
5 20130118 P
5 20130124 B
5 20130131 P
5 20130214 P
5 20130221 B
;
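data want;
  if _n_=1 then do;
    if 0 then set have;              /* define the hash host variables without reading data */
    declare hash h(multidata:'y');   /* allow several rows per acctnum */
    h.definekey('acctnum');
    h.definedata('acctnum','date','type');
    h.definedone();
    declare hiter hi('h');
  end;
  set have;
  by acctnum;
  /* New account: empty the hash and reset the B counter */
  if first.acctnum then do;
    rc=h.clear();
    call missing(_i);
  end;
  if type='B' then _i+1;             /* count the B rows seen so far */
  if _i=1 then rc=h.add();           /* buffer rows from the first B up to the second B */
  /* Output the buffered rows only if the account has a second B */
  if last.acctnum and _i>1 then do;
    rc=hi.first();
    do rc=0 by 0 while (rc=0);
      output;
      rc=hi.next();
    end;
  end;
  drop rc _i;
run;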
Haikuo
Thanks for your input, HaiKuo. The question now becomes more interesting.
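For the single-'B' case raised above, a double DOW loop might be another way to skip those groups without a hash object. This is only a sketch, not a tested replacement for either answer: it assumes the data set is named have, as in the hash example, and is sorted by AcctNum and Date. The counter names _nb and _i are made up for illustration.

```sas
data want;
  /* Pass 1 over the BY group: count the B rows in this account */
  _nb = 0;
  do until (last.acctnum);
    set have;
    by acctnum;
    if type = 'B' then _nb = _nb + 1;
  end;
  /* Pass 2 over the same BY group: output rows from the first B
     up to (not including) the second B, but only when the account
     has at least two B rows */
  _i = 0;
  do until (last.acctnum);
    set have;
    by acctnum;
    if type = 'B' then _i = _i + 1;
    if _nb > 1 and _i = 1 then output;
  end;
  drop _nb _i;
run;
```

With the sample data in the hash example, account 4 (a single 'B') should produce no output rows. Changing the condition to `_i = 1` alone would keep single-'B' accounts, which matches the behavior of the flag-based code.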