Hi Guys,
I have this data-set and i need to select observation depending upon the occurrence of the event/switch.
data have;
infile cards missover;
input id flag switch $8.;
cards;
001 0
001 0
001 0
001 1 first
001 0
001 0
001 1 second
001 0
001 0
002 0
002 1 first
002 0
002 0
002 1 second
002 1 third
;
run;
/*I need three outputs based on switches i,e.*/
/**/
/*data first will looks like:*/
data first;
input id flag switch $8.;
cards;
001 0
001 0
001 0
001 1 first
002 0
002 1 first
;
run;
/*this is selecting observation from first to second*/
data second;
input id flag switch $8.;
cards;
001 1 first
001 0
001 0
001 1 second
002 1 first
002 0
002 0
002 1 second
;
run;
data third;
input id flag switch $8.;
cards;
002 1 second
002 1 third
;
run;
Hi @SASFREAK,
Try this:
data first second third;
do _n_=1 by 1 until(last.id);
set have;
by id;
if switch='first' then n1=_n_;
else if switch='second' then n2=_n_;
else if switch='third' then n3=_n_;
end;
do _n_=1 by 1 until(last.id);
set have;
by id;
if _n_<=n1 then output first;
if n1<=_n_<=n2 then output second;
if n2<=_n_<=n3 then output third;
end;
drop n1-n3;
run;
Hi @SASFREAK,
Try this:
data first second third;
do _n_=1 by 1 until(last.id);
set have;
by id;
if switch='first' then n1=_n_;
else if switch='second' then n2=_n_;
else if switch='third' then n3=_n_;
end;
do _n_=1 by 1 until(last.id);
set have;
by id;
if _n_<=n1 then output first;
if n1<=_n_<=n2 then output second;
if n2<=_n_<=n3 then output third;
end;
drop n1-n3;
run;
data want;
set have;
by id;
if switch='first' then seq='1';
else if switch='second' then seq='2';
else if switch='third' then seq='3';
else seq='4';
run;
data want2 first second third;
length new $20;
do until(last.id);
set want;
by id;
retain new;
if first.id then new=seq;
else new=cats(new,seq);
end;
i=0;
do until(last.id);
set want;
by id;
i+1;
new1=substr(new,i,length(new));
if prxmatch('/14/',new1) then
output first;
if prxmatch('/2[3|4]/',new1) then
output second;
if prxmatch('/(^23\s)|(^3\s)/',new1) then
output third;
output want2;
end;
drop new new1;
run;
I think you can do this while reading the data set only once:
data first second third;
set have;
by id;
length destination $ 6;
if first.id then destination = 'first';
if destination='first' then output first;
else if destination='second' then output second;
else output third;
if switch > ' ' then destination = switch;
retain destination;
run;
Sigh ... it's always nicer when a program yields the correct solution. Here's another try:
data first second third;
set have;
by id;
length destination $ 6;
if first.id then destination = 'first';
if destination='first' then output first;
else if destination='second' then output second;
else output third;
if switch = 'first' then destination = 'second';
else if switch='second' then destination='third';
retain destination;
run;
@Astounding wrote:
I think you can do this while reading the data set only once:
That would improve efficiency indeed, but I think with the current requirements it's a bit difficult (though possible): When you arrive at a record with SWITCH='second' you don't know yet whether there will be a record with SWITCH='third' in the same BY group. So, until you know, you'd have to write the current and subsequent observations to a temporary location (possibly a hash object). Then, if SWITCH='third', dump them into dataset THIRD, otherwise discard them.
You're right. I ignored the possibility of discarding records. The problem even has an example of that ...
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.