Hi Guys,
I have this data-set and i need to select observation depending upon the occurrence of the event/switch.
data have;
infile cards missover;
input id flag switch $8.;
cards;
001 0
001 0
001 0
001 1 first
001 0
001 0
001 1 second
001 0
001 0
002 0
002 1 first
002 0
002 0
002 1 second
002 1 third
;
run;
/*I need three outputs based on switches i,e.*/
/**/
/*data first will looks like:*/
data first;
input id flag switch $8.;
cards;
001 0
001 0
001 0
001 1 first
002 0
002 1 first
;
run;
/*this is selecting observation from first to second*/
data second;
input id flag switch $8.;
cards;
001 1 first
001 0
001 0
001 1 second
002 1 first
002 0
002 0
002 1 second
;
run;
data third;
input id flag switch $8.;
cards;
002 1 second
002 1 third
;
run;
Hi @SASFREAK,
Try this:
data first second third;
do _n_=1 by 1 until(last.id);
set have;
by id;
if switch='first' then n1=_n_;
else if switch='second' then n2=_n_;
else if switch='third' then n3=_n_;
end;
do _n_=1 by 1 until(last.id);
set have;
by id;
if _n_<=n1 then output first;
if n1<=_n_<=n2 then output second;
if n2<=_n_<=n3 then output third;
end;
drop n1-n3;
run;
Hi @SASFREAK,
Try this:
data first second third;
do _n_=1 by 1 until(last.id);
set have;
by id;
if switch='first' then n1=_n_;
else if switch='second' then n2=_n_;
else if switch='third' then n3=_n_;
end;
do _n_=1 by 1 until(last.id);
set have;
by id;
if _n_<=n1 then output first;
if n1<=_n_<=n2 then output second;
if n2<=_n_<=n3 then output third;
end;
drop n1-n3;
run;
data want;
set have;
by id;
if switch='first' then seq='1';
else if switch='second' then seq='2';
else if switch='third' then seq='3';
else seq='4';
run;
data want2 first second third;
length new $20;
do until(last.id);
set want;
by id;
retain new;
if first.id then new=seq;
else new=cats(new,seq);
end;
i=0;
do until(last.id);
set want;
by id;
i+1;
new1=substr(new,i,length(new));
if prxmatch('/14/',new1) then
output first;
if prxmatch('/2[3|4]/',new1) then
output second;
if prxmatch('/(^23\s)|(^3\s)/',new1) then
output third;
output want2;
end;
drop new new1;
run;
I think you can do this while reading the data set only once:
data first second third;
set have;
by id;
length destination $ 6;
if first.id then destination = 'first';
if destination='first' then output first;
else if destination='second' then output second;
else output third;
if switch > ' ' then destination = switch;
retain destination;
run;
Sigh ... it's always nicer when a program yields the correct solution. Here's another try:
data first second third;
set have;
by id;
length destination $ 6;
if first.id then destination = 'first';
if destination='first' then output first;
else if destination='second' then output second;
else output third;
if switch = 'first' then destination = 'second';
else if switch='second' then destination='third';
retain destination;
run;
@Astounding wrote:
I think you can do this while reading the data set only once:
That would improve efficiency indeed, but I think with the current requirements it's a bit difficult (though possible): When you arrive at a record with SWITCH='second' you don't know yet whether there will be a record with SWITCH='third' in the same BY group. So, until you know, you'd have to write the current and subsequent observations to a temporary location (possibly a hash object). Then, if SWITCH='third', dump them into dataset THIRD, otherwise discard them.
You're right. I ignored the possibility of discarding records. The problem even has an example of that ...
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.