Hi,
Base SAS 9.4 user;
I'm a first-time user of PRXMATCH and I'm having trouble getting it to work in my DATA step.
I have 2 variables in a dataset and I want to cross-reference all the words in one variable with all the words in the other to see whether there are any matches. Here is an example of my code:
data xxx ;
x="Drug: Pimavanserin tartrate (ACP-103)" ;
y="Drug: pimavanserin tartrate" ;
output ;
x="Drug: masitinib" ;
y="Drug: masitinib (AB1010)" ;
output ;
run ;
What I want now is to see whether any of the words in y appear in x. I decided to try PRXMATCH, so I've created a pattern (which hopefully contains all the Perl syntax and the words to match: 'm/word1|word2|word3/io').
data yyy ;
set xxx ;
/*create pattern to search for*/
pattern='m/'||lowcase(translate(compress(strip(scan(y,2,":")),,"adsk"),"|"," "))||'/io' ;
parse=prxparse(pattern) ;
/*this is my checking criteria... so eventually if z>0 then there is a match with >= 1 of the keywords*/
z=prxmatch(prxparse(pattern),x) ;
run ;
What I'm finding is that because the regular expression ID (from PRXPARSE) isn't changing, it only finds the first match and nothing after that.
Can anyone help?
thanks
Lindsey
Accepted Solutions
The problem is probably the "o" option on your expression, which means the pattern is compiled only once in the DATA step, so the pattern built from the first row is reused for every later row. Try changing "/io" at the end of the expression to just "/i".
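Applied to the original step, that change might look like the sketch below. Beyond dropping "o", it makes a few assumptions that are not in the original post: the explicit length for pattern, COMPBL to collapse repeated blanks (so two adjacent spaces do not turn into an empty "||" alternative that matches everything), and CALL PRXFREE to release each row's compiled pattern.

```sas
data yyy ;
  set xxx ;
  length pattern $200 ;   /* assumed length, adjust to your data */
  /* keep letters, digits and spaces from the text after the colon in y, */
  /* collapse repeated blanks, then join the words with | */
  pattern = cats('m/',
                 lowcase(translate(strip(compbl(compress(scan(y,2,":"),,"adsk"))),"|"," ")),
                 '/i') ;   /* no "o", so the pattern is recompiled for every row */
  re = prxparse(pattern) ;
  z = prxmatch(re,x) ;     /* position of the first match in x, 0 if none */
  call prxfree(re) ;       /* free the compiled pattern before the next row */
  drop re ;
run ;
```

With the two example rows in xxx, z should be greater than 0 for both, because each row is now checked against its own keywords.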
What would be the values of Z you would want for the following data?
x | y | z |
pimavanserin tartrate (ACP-103) | pimavanserin tartrate | |
masitinib | masitinib (AB1010) | |
aspirin tablets 100mg | aspirin | |
calcium tartrate | calcium | |
calcium tartrate | tartrate | |
aluminum tartrate | tartrate | |
doxycycline | lexapro | |
hydrochlorothiazide oxide | oxide | |
Ibuprofen tablets 100mg red | tab | |
one two three four | two three | |
I know this is very imperfect, but a colleague is finding the minimum matches and I'm finding the maximum, so we know the true answer will lie somewhere in between:
x | y | z |
pimavanserin tartrate (ACP-103) | pimavanserin tartrate | 1 |
masitinib | masitinib (AB1010) | 1 |
aspirin tablets 100mg | aspirin | 1 |
calcium tartrate | calcium | 1 |
calcium tartrate | tartrate | 1 |
aluminum tartrate | tartrate | 1 |
doxycycline | lexapro | 0 |
hydrochlorothiazide oxide | oxide | 1 |
Ibuprofen tablets 100mg red | tab | 0 |
one two three four | two three | 1 |
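The 0 for "tab" against "Ibuprofen tablets 100mg red" implies whole-word matching, which a plain 'm/word1|word2/i' pattern would not give (it would find "tab" inside "tablets"). A minimal sketch of one way to get that behaviour is to wrap the alternatives in \b word boundaries. The dataset names, variable lengths, and the 0/1 form of z are assumptions for illustration, and only four of the rows above are included.

```sas
data sample ;
  infile datalines dlm='|' truncover ;
  length x y $60 ;         /* assumed lengths */
  input x y ;
  datalines ;
pimavanserin tartrate (ACP-103)|pimavanserin tartrate
masitinib|masitinib (AB1010)
doxycycline|lexapro
Ibuprofen tablets 100mg red|tab
;
run ;

data check ;
  set sample ;
  length pattern $200 ;
  /* \b( ... )\b makes each keyword match only as a whole word */
  pattern = cats('m/\b(',
                 lowcase(translate(strip(compbl(compress(y,,"adsk"))),"|"," ")),
                 ')\b/i') ;
  re = prxparse(pattern) ;
  z = (prxmatch(re,x) > 0) ;   /* 1 if any whole keyword from y occurs in x, else 0 */
  call prxfree(re) ;
  drop re ;
run ;
```

Note that an empty y would produce an empty group that matches at any word boundary, so blank values are best filtered out before building the pattern.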
BINGO! thank you so much!