Hi,
Base SAS 9.4 user;
I'm a first-time user of PRXMATCH and I'm having trouble getting it to work in my data step.
I have 2 variables set up in a dataset and I want to cross reference all the words in one variable with all the words in the other variable to see if there were any matches. Here is an example of my code:
data xxx ;
x="Drug: Pimavanserin tartrate (ACP-103)" ;
y="Drug: pimavanserin tartrate" ;
output ;
x="Drug: masitinib" ;
y="Drug: masitinib (AB1010)" ;
output ;
run ;
What I want now is to see if any of the words in y appear in x. I decided to see if PRXMATCH would work, so I've created a pattern (which hopefully contains all the Perl syntax and the words to match: 'm/word1|word2|word3/io').
data yyy ;
set xxx ;
/*create pattern to search for*/
pattern='m/'||lowcase(translate(compress(strip(scan(y,2,":")),,"adsk"),"|"," "))||'/io' ;
parse=prxparse(pattern) ;
/*this is my checking criteria... so eventually if z>0 then there is a match with >= 1 of the keywords*/
z=prxmatch(prxparse(pattern),x) ;
run ;
What I'm finding is that because the regular expression ID (from PRXPARSE) isn't changing, it only finds the first match and then nothing else.
Can anyone help?
thanks
Lindsey
The problem is probably the "o" option on your expression. It tells SAS to compile the pattern only once in the data step, so the regular expression built from the first row is reused for every later row. Try changing "/io" at the end of the expression to just "/i".
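For illustration, here is a minimal sketch of the data step with that change applied. It assumes the xxx dataset from the question; the variable name re, the $200 length, and the extra COMPBL and CALL PRXFREE calls are my own additions rather than part of the original code, so treat it as one possible way to write it.

```sas
data yyy ;
  set xxx ;
  length pattern $200 ;  /* assumed length, adjust to your data */
  /* keep only letters, digits and spaces from the text after the colon,
     collapse repeated blanks, then join the words with | */
  pattern = cats('m/',
                 lowcase(translate(compbl(strip(compress(scan(y, 2, ':'), , 'adsk'))), '|', ' ')),
                 '/i') ;
  re = prxparse(pattern) ;  /* no "o" option: compiled again for every row */
  z = prxmatch(re, x) ;     /* position of the first match in x, 0 if none */
  call prxfree(re) ;        /* release the compiled expression for this row */
  drop re ;
run ;
```

With the two example rows from the question, z should come out greater than 0 on both. Freeing the expression on each row is optional, but without "o" a new one is compiled per observation, so releasing it keeps memory use flat on large datasets.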
What would be the values of Z you would want for the following data?
x | y | z |
pimavanserin tartrate (ACP-103) | pimavanserin tartrate | |
masitinib | masitinib (AB1010) | |
aspirin tablets 100mg | aspirin | |
calcium tartrate | calcium | |
calcium tartrate | tartrate | |
aluminum tartrate | tartrate | |
doxycycline | lexapro | |
hydrochlorothiazide oxide | oxide | |
Ibuprofen tablets 100mg red | tab | |
one two three four | two three | |
I know this is very imperfect, but a colleague is finding the minimum matches and I'm finding the maximum, so we know the true answer will lie somewhere in between.
x | y | z |
pimavanserin tartrate (ACP-103) | pimavanserin tartrate | 1 |
masitinib | masitinib (AB1010) | 1 |
aspirin tablets 100mg | aspirin | 1 |
calcium tartrate | calcium | 1 |
calcium tartrate | tartrate | 1 |
aluminum tartrate | tartrate | 1 |
doxycycline | lexapro | 0 |
hydrochlorothiazide oxide | oxide | 1 |
Ibuprofen tablets 100mg red | tab | 0 |
one two three four | two three | 1 |
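The expected values in this table imply whole-word matching: "tab" should not match "tablets", which a plain word1|word2 alternation would allow. One possible way to get that behaviour is to wrap the alternation in \b word boundaries. The sketch below is only an illustration under that assumption; the dataset names test and result, the variables words and re, and the lengths are all made up for the example, and y is used directly because the table rows have no "Drug:" prefix.

```sas
data test ;
  infile datalines dlm='|' truncover ;
  length x y $60 ;
  input x y ;
  datalines ;
pimavanserin tartrate (ACP-103)|pimavanserin tartrate
masitinib|masitinib (AB1010)
aspirin tablets 100mg|aspirin
calcium tartrate|calcium
calcium tartrate|tartrate
aluminum tartrate|tartrate
doxycycline|lexapro
hydrochlorothiazide oxide|oxide
Ibuprofen tablets 100mg red|tab
one two three four|two three
;
run ;

data result ;
  set test ;
  length words $60 pattern $200 ;
  /* keep letters, digits and spaces, then collapse repeated blanks */
  words = compbl(strip(compress(y, , 'adsk'))) ;
  /* \b( ... )\b only matches whole words, so "tab" does not hit "tablets" */
  pattern = cats('/\b(', translate(strip(words), '|', ' '), ')\b/i') ;
  re = prxparse(pattern) ;       /* no "o" option: one compile per row */
  z = (prxmatch(re, x) > 0) ;    /* 1 if any word of y occurs in x, else 0 */
  call prxfree(re) ;
  drop re words ;
run ;
```

If y can ever be blank, the pattern would degenerate to an empty group, so a check for a missing y before calling PRXPARSE would be a sensible addition.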
BINGO! thank you so much!