I am using the prxmatch function to search for words in a variable, but some of the results are not exact matches. In particular, I was searching for the word "tree" but also got "street". How do I get an exact match, i.e., exclude results where the word I am searching for only appears inside a longer word?
Thanks.
It's hard to say without knowing exactly what you're searching for or what your data might contain, but for the example you provided, something like prxmatch('/\btree\b/i', my_text) should fix the issue. The \b signifies a word boundary. If you just put a space before the word instead, you would miss any "tree" values where "tree" starts the string.
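To see the effect of \b, here is a minimal sketch that compares the pattern with and without word boundaries. The data set name demo and the three sample rows are made up for illustration; only prxmatch and the my_text variable name come from the post above.

```sas
/* Hypothetical sample rows, used only to illustrate the word boundary */
data demo;
  infile datalines truncover;
  input my_text $60.;
  /* Without \b: "tree" matches anywhere, including inside "street" */
  loose  = prxmatch('/tree/i', my_text) > 0;
  /* With \b: "tree" matches only as a whole word */
  strict = prxmatch('/\btree\b/i', my_text) > 0;
  datalines;
Tree fell on the fence
a man walking on the street
hit by a falling tree
;
run;

proc print data=demo;
run;
```

All three rows should have loose=1, while strict should be 0 only for the "street" row. The first row also shows that \b matches at the very start of the string, which a leading space in the pattern would not.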
Thanks for the reply. Below is the code I wrote to find the cases that involve a broken tree:
data spct.tree;
set treedata;
if prxmatch("m/trees|limbs|branches/oi", combined_description) > 0 then tree=1;
else tree=0;
run;
However, as I said, the results include cases such as "a man walking on the street", because "tree" is part of the word "street". Since I am searching for several words that I think are related to trees, as you can see, how or where do I add the "\b" to take care of the problem? Thanks, I truly appreciate your help.
Again, this might not do everything you want depending on your data, but this is how you'd incorporate the word boundary into your code:
if prxmatch("m/\btrees\b|\blimbs\b|\bbranches\b/oi", combined_description) > 0 then tree=1;
Note that you can also get the 0, 1 boolean you want by using:
tree = prxmatch(....) > 0;
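Putting the two suggestions together, the whole step might look like the sketch below. It reuses the data set and variable names from the question and assumes the spct libref is already assigned and treedata exists.

```sas
/* Sketch only: assumes libref spct is assigned and work.treedata exists */
data spct.tree;
  set treedata;
  /* The comparison evaluates to 1 (true) or 0 (false), so no IF/ELSE is needed */
  tree = prxmatch("m/\btrees\b|\blimbs\b|\bbranches\b/oi", combined_description) > 0;
run;
```

Because prxmatch returns the position of the first match, or 0 when there is none, the "> 0" comparison turns that position into the 0/1 flag directly.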
To match singular and plural words, you could use "m/\b(trees?|limbs?|branch(es)?)\b/oi"
\b means word boundary
? means match zero or one occurrence of the preceding character or group
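A quick way to check the singular/plural pattern is to run it on a few test rows. This is a sketch with invented sample descriptions; the data set name demo_plural is hypothetical, and the variable name is borrowed from the question.

```sas
/* Hypothetical sample rows for checking the singular/plural pattern */
data demo_plural;
  infile datalines truncover;
  input combined_description $80.;
  /* One \b on each side of the group covers every alternative */
  tree = prxmatch("m/\b(trees?|limbs?|branch(es)?)\b/oi", combined_description) > 0;
  datalines;
a broken tree blocked the road
fallen branches on the roof
a man walking on the street
several streets were closed
;
run;

proc print data=demo_plural;
run;
```

The first two rows should get tree=1 and the last two tree=0, since neither "street" nor "streets" has a word boundary right before "tree".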