Select observations with a certain pattern

Solved
Occasional Contributor
Posts: 13

Hi everyone,

I'd like to select only id 001 from this data set, because its med values follow the pattern A B A. Any help would be greatly appreciated.

data a;
   input id $ date :mmddyy10. med $;
   cards;
001 1/1/2014 A
001 2/1/2014 B
001 3/1/2014 A
002 1/1/2014 A
002 2/1/2014 A
002 3/1/2014 B
002 4/1/2014 B
003 1/1/2013 A
003 2/1/2014 B
004 1/1/2014 A
;

Accepted Solutions
Solution
‎05-03-2014 11:32 PM
Posts: 5,523

Re: Select observations with a certain pattern

I added a pattern counter (pat) to distinguish multiple occurrences of the pattern within an id:

data a;
   input id $ date :mmddyy10. med $;
   cards;
001 1/1/2014 A
001 2/1/2014 B
001 3/1/2014 A
002 1/1/2014 A
002 2/1/2014 A
002 3/1/2014 B
002 4/1/2014 B
003 1/1/2013 A
003 2/1/2014 B
004 1/1/2014 A
;

data pat;
   set a;
   /* Match when the current and two previous records share an id and read A, B, A */
   if id = lag1(id) and id = lag2(id) and med = "A" and lag1(med) = "B" and lag2(med) = "A" then do;
      pat + 1;                   /* number this occurrence of the pattern */
      do point = _n_-2 to _n_;   /* re-read the three matching records by position */
         set a point=point;
         output;
      end;
   end;
   drop point;
run;

proc print data=pat; run;

PG
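
Each LAGn call keeps its own queue, and that queue is updated only when the call actually executes. A more cautious variant of the step above therefore calls LAG in unconditional assignment statements and tests the saved values. This is only a sketch, not the original poster's code: the names pat2, id1, id2, med1 and med2 are hypothetical, and it assumes data set a is already ordered by id and date.

```sas
data pat2;
   set a;
   /* Call LAG on every observation so each queue stays in step with the data */
   id1  = lag1(id);
   id2  = lag2(id);
   med1 = lag1(med);
   med2 = lag2(med);
   /* Same test as before, now on the saved lag values */
   if id = id1 and id = id2 and med = "A" and med1 = "B" and med2 = "A" then do;
      pat + 1;                     /* number this occurrence of the pattern */
      do point = _n_ - 2 to _n_;   /* re-read the three matching records */
         set a point=point;
         output;
      end;
   end;
   drop point id1 id2 med1 med2;   /* helper variables are not needed in the result */
run;
```

The output should have the same layout as pat, because the helper variables are dropped.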

All Replies
PROC Star
Posts: 8,163

Re: Select observations with a certain pattern

Define what you mean by pattern. Would A A B A count? What about A B B A?

If the first three records for an id have to be A, then B, then A, you could use something like:

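data want (drop=check counter);
   /* First pass over the id: count how many of its first three records match A, B, A */
   do until (last.id);
      set a;
      by id;
      if first.id then do;
         check=0;
         counter=1;
      end;
      else counter+1;
      if counter in (1,3) and med eq 'A' then check+1;
      else if counter eq 2 and med eq 'B' then check+1;
   end;
   /* Second pass over the same id: output its records only if all three matched */
   do until (last.id);
      set a;
      by id;
      if check eq 3 then output;
   end;
run;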
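
If the answer to the questions above is that A B A may occur anywhere within an id (so A A B A counts and A B B A does not), one alternative is to collapse each id's med values into a single string and search it. This is a minimal sketch under stated assumptions: data set a is sorted by id and date, each med code is a single character, and the names flagged, seq, hit and want_any, along with the $200 length, are hypothetical choices.

```sas
/* One row per id whose med sequence contains A B A somewhere */
data flagged (keep=id seq);
   length seq $200;             /* assumed upper bound on records per id */
   do until (last.id);
      set a;
      by id;
      seq = cats(seq, med);     /* append this record's code, e.g. 'A' -> 'AB' -> 'ABA' */
   end;
   if index(seq, 'ABA') > 0;    /* subsetting IF: keep the id only when the pattern is found */
run;

/* Pull every original record for the flagged ids */
data want_any;
   merge a flagged (in=hit keep=id);
   by id;
   if hit;
run;
```

To require that the whole sequence be exactly A B A, the INDEX test could be replaced with seq = 'ABA'.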

Occasional Contributor
Posts: 13

Re: Select observations with a certain pattern

Hi Arthur and PGStats - Thanks very much for your help. I was able to tweak the code a little to work with my real larger dataset. Thanks again!
