I'm using prxmatch to match the string "END OF LIFE" but I don't want that string to match if it is followed by " CARE". However, I can't seem to get the regular expression right (see below). I could find lots of examples for excluding single characters but not for strings.
The regular expression should not match on "END OF LIFE CARE", but should match on "END OF LIFE MEDICATIONS" and "END OF LIFE OPTION" and "NEAR END OF LIFE OPTION"
data out;
set in;
if prxmatch('/(END OF LIFE[^( CARE)])/i',strip(string)) then output;
run;
Thanks.
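A note on why the pattern in the question does not behave as intended: inside square brackets, `( CARE)` is not a group but a list of single characters, so `[^( CARE)]` means "one character that is not `(`, a space, `C`, `A`, `R`, `E`, or `)`". Every sample string has either a space or nothing after "END OF LIFE", so none of them should match. A minimal sketch of this; the dataset name `demo`, the two flag variables, and the last data row are made up for illustration:

```sas
data demo;
  infile datalines truncover;
  input string $40.;
  /* Pattern from the question: the bracket expression is a character class */
  class_hit = (prxmatch('/END OF LIFE[^( CARE)]/i', strip(string)) > 0);
  /* The same class written out explicitly; it should give identical results */
  explicit_hit = (prxmatch('/END OF LIFE[^ ()ACER]/i', strip(string)) > 0);
  datalines;
END OF LIFE CARE
END OF LIFE MEDICATIONS
END OF LIFE OPTION
NEAR END OF LIFE OPTION
END OF LIFE-RELATED
;
run;

proc print data=demo;
run;
```

Only the last (hypothetical) row should be flagged, because `-` is the only character following "END OF LIFE" that is outside the class. Excluding a whole string needs a different construct, such as a lookahead.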
data have;
input str $50.;
cards;
END OF LIFE CARE
END OF LIFE MEDICATIONS
END OF LIFE OPTION
NEAR END OF LIFE OPTION
;
data want;
set have;
/* Keep rows where END OF LIFE is followed by a word that does not contain CARE */
if prxmatch('/END OF LIFE\s\b(?:(?!CARE)\w)+\b/', str);
run;
Yes! That works for the criteria I listed. I forgot to include one more though, "END OF LIFE" by itself without any characters or space before or after. This should match as well but doesn't using the regular expression you wrote. Can this be modified to include this as well?
Hi @Ryanb2, please try this:
data have;
input str $50.;
cards;
CARE
END OF LIFE CARE
END OF LIFE MEDICATIONS
END OF LIFE OPTION
NEAR END OF LIFE OPTION
END OF LIFE CARE jbjbj
jgjhbjhb
END OF LIFE
;
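data want;
set have;
/* First alternative: the value is exactly END OF LIFE.                      */
/* Second alternative: END OF LIFE followed by a word that has no CARE in it. */
if prxmatch('/^END OF LIFE$|END OF LIFE\s\b(?:(?!CARE)\w)+\b/', strip(str));
run;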
That worked! Thank you.
It's frustrating the solution is not simpler. It seems logical and practical that strings to exclude could be contained in parentheses like in my example code. It would make it a lot easier to add more strings to exclude if testing showed more were needed.
Much appreciated.
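For readers who also want the exclusion list in one place: a negative lookahead, `(?!...)`, gets close to the "strings in parentheses" idea. It checks what follows without consuming it, and the excluded strings can be listed as alternatives. The sketch below is not the accepted solution but an alternative under stated assumptions: `HOSPICE` is a made-up second exclusion, the `/i` flag is carried over from the original question, and `have` repeats the sample data from this thread.

```sas
data have;
  input str $50.;
  cards;
CARE
END OF LIFE CARE
END OF LIFE MEDICATIONS
END OF LIFE OPTION
NEAR END OF LIFE OPTION
END OF LIFE CARE jbjbj
jgjhbjhb
END OF LIFE
;

data want;
  set have;
  /* Match END OF LIFE unless it is followed by whitespace and an excluded word. */
  /* Add further exclusions inside (?:...) separated by |                        */
  /* HOSPICE is a hypothetical example.                                          */
  if prxmatch('/END OF LIFE\b(?!\s+(?:CARE|HOSPICE)\b)/i', strip(str));
run;
```

On this sample the filter should keep the same rows as the accepted pattern. It is not identical in general, though: it is case-insensitive, it also accepts values that merely end in "END OF LIFE", and because of the trailing `\b` it would not exclude a longer word such as "CAREGIVER". Adjust as needed.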
Hi @Ryanb2, personally I am glad SAS (my only religion) is rather vast and, hopefully, complex, for I will always have something new to learn. Believing I have some real youth left and can keep reasonably good mental and physical health (mine and my immediate family's) with reasonable economic conditions over the next 25+ years, I would love to master the ins and outs of SAS in its entirety. Peace!
Hi,
You can use the find() function to achieve this:
data HAVE;
infile datalines truncover;
input STRING $500.;
datalines;
END OF LIFE CARE
END OF LIFE MEDICATIONS
END OF LIFE OPTION
NEAR END OF LIFE OPTION
SOMETHING ELSE
;
data WANT;
set HAVE;
if find(string,'END OF LIFE') and not find(string,'END OF LIFE CARE');
run;
Hope this helps.
Thanks for your response. I considered doing something like this, but I simplified my code greatly for this post. I need to match far more strings across many more fields in a SQL statement, and I will likely need to exclude more strings as well. If I were working in a data step I might write something like this and then process all the columns in an array, but I'm looking for a regular expression solution.
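Since the real use case is a SQL statement over several fields, here is a rough sketch of how a regular expression could be reused there. PRXMATCH can be called in a PROC SQL WHERE clause, and keeping the pattern in a macro variable means it is edited in one place. Everything named here is an assumption for illustration: the columns `field1` and `field2`, the macro variable `eol_rx`, the sample rows, and the lookahead pattern itself, which is an alternative to the accepted one.

```sas
/* Hypothetical two-column sample data */
data have;
  infile datalines dlm='|' truncover;
  length field1 field2 $50;
  input field1 $ field2 $;
  datalines;
END OF LIFE CARE|ROUTINE VISIT
ROUTINE VISIT|END OF LIFE OPTION
END OF LIFE|END OF LIFE CARE
SOMETHING ELSE|ROUTINE VISIT
;
run;

/* Pattern defined once; double quotes below let the macro variable resolve */
%let eol_rx = /END OF LIFE\b(?!\s+CARE\b)/i;

proc sql;
  create table want as
  select *
  from have
  where prxmatch("&eol_rx", field1) > 0
     or prxmatch("&eol_rx", field2) > 0;
quit;
```

With these rows the query should keep the second and third records. Further exclusions would go inside the lookahead, for example `(?!\s+(?:CARE|OTHER)\b)`, where `OTHER` stands for whatever string testing turns up.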