Hi,
I'm a Base SAS 9.4 user.
I'm a first-time user of PRXMATCH and I'm having trouble getting it to work in my data step.
I have two variables in a dataset and I want to cross-reference all the words in one variable against all the words in the other to see whether there are any matches. Here is an example of my code:
data xxx ;
x="Drug: Pimavanserin tartrate (ACP-103)" ;
y="Drug: pimavanserin tartrate" ;
output ;
x="Drug: masitinib" ;
y="Drug: masitinib (AB1010)" ;
output ;
run ;
What I want now is to see whether any of the words in y appear in x. I decided to see if PRXMATCH would work, so I've created a pattern (which hopefully contains all the Perl syntax and the words to match: 'm/word1|word2|word3/io').
data yyy ;
set xxx ;
/*create pattern to search for*/
pattern='m/'||lowcase(translate(compress(strip(scan(y,2,":")),,"adsk"),"|"," "))||'/io' ;
parse=prxparse(pattern) ;
/*this is my checking criteria... so eventually if z>0 then there is a match with >= 1 of the keywords*/
z=prxmatch(prxparse(pattern),x) ;
run ;
What I'm finding is that, because the regular expression ID (from PRXPARSE) isn't changing, it only finds the first match and then nothing else.
Can anyone help?
thanks
Lindsey
The problem is probably that you use the "o" option on your expression, which means the expression is compiled only once in the data step, so the pattern built from the first observation is reused for every later row. Try changing "/io" at the end of the expression to just "/i".
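For illustration, here is a minimal sketch of that change applied to the data step from the question. It assumes the xxx dataset shown above; the variable name re, the explicit length of 400, the COMPBL call, and the CALL PRXFREE cleanup are additions made for this sketch and are not part of the original code.

```sas
data yyy ;
  set xxx ;
  length pattern $ 400 ;   /* assumed length, long enough for the sample rows */

  /* Keep only letters, digits and spaces from the text after the colon,
     collapse repeated blanks, then join the remaining words with "|". */
  pattern = cats('m/',
                 lowcase(translate(compbl(strip(compress(scan(y,2,":"),,"adsk"))),"|"," ")),
                 '/i') ;   /* no "o" option, so the pattern is recompiled for every row */

  re = prxparse(pattern) ;   /* re is a hypothetical variable name */
  z  = prxmatch(re, x) ;     /* z > 0 when at least one keyword is found in x */
  call prxfree(re) ;         /* release the compiled expression for this row */
  drop re ;
run ;
```

Because every observation now compiles its own expression, freeing it with CALL PRXFREE should keep compiled patterns from accumulating when the dataset is large.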
What would be the values of Z you would want for the following data?
x | y | z |
pimavanserin tartrate (ACP-103) | pimavanserin tartrate | |
masitinib | masitinib (AB1010) | |
aspirin tablets 100mg | aspirin | |
calcium tartrate | calcium | |
calcium tartrate | tartrate | |
aluminum tartrate | tartrate | |
doxycycline | lexapro | |
hydrochlorothiazide oxide | oxide | |
Ibuprofen tablets 100mg red | tab | |
one two three four | two three | |
I know this is very imperfect, but a colleague is finding the minimum number of matches and I'm finding the maximum, so we know the true answer will lie somewhere in between.
x | y | z |
pimavanserin tartrate (ACP-103) | pimavanserin tartrate | 1 |
masitinib | masitinib (AB1010) | 1 |
aspirin tablets 100mg | aspirin | 1 |
calcium tartrate | calcium | 1 |
calcium tartrate | tartrate | 1 |
aluminum tartrate | tartrate | 1 |
doxycycline | lexapro | 0 |
hydrochlorothiazide oxide | oxide | 1 |
Ibuprofen tablets 100mg red | tab | 0 |
one two three four | two three | 1 |
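One detail in this table is that "tab" against "Ibuprofen tablets 100mg red" should give 0, which a plain 'word1|word2' alternation would not do, because "tab" matches inside "tablets". A possible way to get whole-word matching is to wrap the alternation in \b word boundaries. The sketch below is only an illustration under that assumption: the dataset names test and check and the variables words and re are hypothetical, and the rows are copied from the table above without the "Drug:" prefix.

```sas
/* Sample rows taken from the table above; "|" is the only delimiter. */
data test ;
  infile datalines dlm='|' truncover ;
  length x y $ 60 ;
  input x y ;
datalines ;
pimavanserin tartrate (ACP-103)|pimavanserin tartrate
masitinib|masitinib (AB1010)
aspirin tablets 100mg|aspirin
calcium tartrate|calcium
doxycycline|lexapro
Ibuprofen tablets 100mg red|tab
one two three four|two three
;
run ;

data check ;
  set test ;
  length words $ 60 pattern $ 200 ;

  /* Keep letters, digits and spaces, then collapse repeated blanks. */
  words = compbl(strip(compress(y,,"adsk"))) ;

  if not missing(words) then do ;
    /* \b(?:word1|word2)\b matches any keyword, but only as a whole word. */
    pattern = cats('/\b(?:', translate(strip(words), '|', ' '), ')\b/i') ;
    re = prxparse(pattern) ;         /* no "o" option: compiled per row */
    z  = (prxmatch(re, x) > 0) ;     /* 1 = at least one whole-word match */
    call prxfree(re) ;               /* release the compiled expression */
  end ;
  else z = 0 ;                       /* nothing left to search for */

  drop re ;
run ;
```

This is intended to reproduce the z column for the rows included here, with "tab" no longer matching "tablets"; whether whole-word matching is the right definition of a match is a judgment call for the data at hand.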
BINGO! thank you so much!