Hello,
I have a dataset that looks like the one below, but I am having a hard time selecting only the data between the first B and the second B within each account. Does anyone have an idea how to achieve that?
Original Dataset (example):
AcctNum Date Type
1 20130104 P
1 20130118 P
1 20130124 B
1 20130131 P
1 20130214 P
1 20130221 B
2 20130109 B
2 20130111 P
2 20130206 B
3 20130114 B
3 20130214 B
. . .
. . .
. . .
Dataset I want to achieve:
1 20130124 B
1 20130131 P
1 20130214 P
2 20130109 B
2 20130111 P
3 20130114 B
Thanks in advance!
Regards,
Frank
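I'm not sure whether there is a function for this, but you can easily create the flag variables and logic yourself, e.g.:
data want (drop=firstone keepit);
  set example;
  by acctnum;
  retain keepit firstone;
  /* reset both flags at the start of each account */
  if first.acctnum then do;
    firstone=1;
    keepit=0;
  end;
  /* first B in the account: start keeping rows */
  if firstone and type eq "B" then do;
    keepit=1;
    firstone=0;
  end;
  /* second B: stop keeping rows, so the second B itself is not output */
  else if keepit and type eq "B" and not firstone then keepit=0;
  if keepit then output;
run;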
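One note on the step above: BY-group processing with first.acctnum expects the input to be ordered by acctnum, and "first B" only means the earliest B if the rows are also in date order within each account. A minimal sketch of the sort, assuming the input dataset is named example as in the code above and that Date sorts chronologically (true for a numeric SAS date or a yyyymmdd value):

```sas
/* Assumption: "example" is the unsorted input dataset. */
/* Order by account, then by date within each account.  */
proc sort data=example;
  by acctnum date;
run;
```

If the data are already stored in this order, the sort is unnecessary.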
Thanks Arthur. It works perfectly. Awesome logic!
Depending on what you want to do when a group has only one 'B', you may need to tweak the code above, which currently also pulls the rows of single-'B' groups. You would need another pass to exclude them. Here is another, hash-based solution that does not pull single-'B' groups.
data have;
input AcctNum $ Date :yymmdd8. Type$;
format date yymmdd8.;
cards;
1 20130104 P
1 20130118 P
1 20130124 B
1 20130131 P
1 20130214 P
1 20130221 B
2 20130109 B
2 20130111 P
2 20130206 B
3 20130114 B
3 20130214 B
4 20130214 B
5 20130104 B
5 20130118 P
5 20130124 B
5 20130131 P
5 20130214 P
5 20130221 B
;
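data want;
  /* one-time setup: a hash that buffers candidate rows of the current account */
  if _n_=1 then do;
    if 0 then set have;
    declare hash h(multidata:'y');
    h.definekey('acctnum');
    h.definedata('acctnum','date','type');
    h.definedone();
    declare hiter hi('h');
  end;
  set have;
  by acctnum;
  /* new account: empty the buffer and reset the B counter */
  if first.acctnum then do;
    rc=h.clear();
    call missing(_i);
  end;
  if type='B' then _i+1;
  /* buffer rows from the first B up to, but not including, the second B */
  if _i=1 then rc=h.add();
  /* write the buffer only if the account has at least two B rows */
  if last.acctnum and _i>1 then do;
    rc=hi.first();
    do rc=0 by 0 while (rc=0);
      output;
      rc=hi.next();
    end;
  end;
  drop rc _i;
run;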
Haikuo
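For comparison, the same "skip accounts with a single B" rule can also be sketched without a hash, using two DO UNTIL loops that each read the same account (often called a double DOW loop). This is only a sketch under assumptions: it reads the dataset have from the post above, expects it to be ordered by acctnum and date, and the names want_dow, _nb and _seen are made up for illustration.

```sas
data want_dow;
  /* Pass 1: count the B rows of the current account */
  do until (last.acctnum);
    set have;
    by acctnum;
    if type = 'B' then _nb = sum(_nb, 1);
  end;
  /* Pass 2: re-read the same account and output the rows from the */
  /* first B up to, but not including, the second B                */
  do until (last.acctnum);
    set have;
    by acctnum;
    if type = 'B' then _seen = sum(_seen, 1);
    if _nb >= 2 and _seen = 1 then output;
  end;
  drop _nb _seen;
run;
```

Both counters are reset to missing at the start of each DATA step iteration, and each iteration handles exactly one account, so no RETAIN or explicit reset is needed. With the sample data above, this should keep rows for accounts 1, 2, 3 and 5 and write nothing for account 4, which has only one B.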
Thanks for your input, HaiKuo. The question now becomes more interesting.