@jahanm wrote:
Unfortunately the structure_id variable is human-entered freetext so I actually do run into the problem you stated. There are some variables "GTVp" that I also want to eliminate, and the findw code earlier doesn't seem to filter variants like that.
Regular expressions allow you to search for text patterns - you just need to be able to formulate the rules for these patterns.
The RegEx below matches any string that contains bowel or gtv, ignoring case. A RegEx would also let you search for a stricter pattern, such as a word consisting of GTV plus at most one additional letter.
data have;
input study_id structure_id : $13.;
cards;
1 BLADDER
2 .
3 BOWEL
4 BOWEL_SMALL
5 GTV
6 GTV_small
;
data want;
set have;
/* drop rows whose structure_id contains bowel or gtv; the i modifier ignores case */
if prxmatch('/bowel|gtv/oi',structure_id)>0 then delete;
run;
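For the stricter case mentioned above (a word consisting of GTV plus at most one additional letter), something along these lines might work. This is only a sketch: the data set names have2 and want2 and the sample values are made up for illustration, so adjust the pattern to whatever variants actually occur in your data.

```sas
/* hypothetical sample data, not taken from the original post */
data have2;
input study_id structure_id : $13.;
cards;
1 BLADDER
2 GTV
3 GTVp
4 GTVnodal
5 GTV_small
;

data want2;
set have2;
/* \b       = word boundary                    */
/* [a-z]?   = zero or one additional letter    */
/* i        = ignore case                      */
if prxmatch('/\bgtv[a-z]?\b/i',structure_id)>0 then delete;
run;
```

With this pattern GTV and GTVp are deleted, while GTVnodal is kept because more than one letter follows GTV. GTV_small is also kept, because an underscore counts as a word character in Perl regular expressions, so there is no word boundary between GTV and the underscore. If you want to drop those rows as well, the looser pattern from the first example is the better fit.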