finding matches for a long list of patterns

Regular Contributor
Posts: 209
I have a long list of patterns that I need to check against my data.
Which function would you recommend so that the code for checking such a long list of patterns is quick to write?
The types of patterns I want to match are:
122Y[digit 0-9]P68A
12811[digit 0-9][digit 0-9]E
13559[digit 0-9][digit 0-9]
65G101[digit 0-9][digit 0-9]
F6364[digit 0-9][digit 0-9]-66[letter A-Z]

 

Thank you.
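
For illustration only, the five pattern shapes listed above could be written as one Perl regular expression and tested with PRXMATCH. This is a sketch under assumptions, not the thread's accepted answer: the dataset names HAVE and WANT are hypothetical, the variable name serial_number is borrowed from later in the thread, and it assumes each pattern must match the whole value with uppercase letters only.

```sas
/* Sketch only: HAVE and WANT are assumed dataset names. */
data want;
   set have;
   /* One alternation covers the five pattern shapes from the question.    */
   /* \d is one digit, [A-Z] is one uppercase letter.                      */
   /* \s*$ tolerates the trailing blanks that pad SAS character values.    */
   check = (prxmatch(
      '/^(122Y\dP68A|12811\d\dE|13559\d\d|65G101\d\d|F6364\d\d-66[A-Z])\s*$/',
      serial_number) > 0);
run;
```

Here check is 1 for a match and 0 otherwise. If a pattern may appear anywhere inside the value, the ^ and \s*$ anchors would need to be dropped.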


Accepted Solutions
Solution
‎05-08-2018 04:10 PM
Super User
Posts: 23,771

Re: finding matches for a long list of patterns


All Replies
Regular Contributor
Posts: 209

Re: finding matches for a long list of patterns

Thank you. This macro seems to work well.

Do you have any more advice with regards to using regex functions with a long list of patterns?

Besides the long list of patterns, the list of values I'm checking against them has more than 700,000 entries.

Any advice for improving efficiency would be appreciated.

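%macro prx(pattern);
   /* Compile the pattern; &pattern resolves to a constant string in each call. */
   b = prxparse("&pattern");
   /* Flag the row when serial_number matches the pattern. */
   if prxmatch(b, serial_number) > 0 then check = 1;
%mend prx;

%prx(/^2C1522\d\d$/);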

Thank you.
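
One possible alternative to calling the macro once per pattern is to keep the patterns in a dataset and loop over them inside a single DATA step. The sketch below is hedged: the dataset names PATTERNS, HAVE and WANT, the two sample patterns, and the 1000-element array bound are all assumptions made for illustration. Each pattern is compiled once, on the first iteration, and the loop stops at the first match.

```sas
/* Hypothetical lookup table: one Perl regular expression per row. */
data patterns;
   infile datalines truncover;
   input pattern $200.;
   datalines;
/^2C1522\d\d\s*$/
/^13559\d\d\s*$/
;
run;

data want;
   set have;                         /* HAVE is an assumed input dataset      */
   array rx[1000] _temporary_;       /* assumed upper bound on pattern count  */
   /* On the first row only, compile every pattern and keep its id.          */
   if _n_ = 1 then do i = 1 to n_pat;
      set patterns point=i nobs=n_pat;
      rx[i] = prxparse(strip(pattern));
   end;
   /* Test the current value against each compiled pattern; stop at a hit.   */
   check = 0;
   do j = 1 to n_pat until (check);
      if prxmatch(rx[j], serial_number) > 0 then check = 1;
   end;
   drop j pattern;
run;
```

Temporary array elements are retained across rows, so the compiled ids stay available for all 700,000+ values. Adding or removing a pattern then means editing the PATTERNS dataset rather than the code.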

Super User
Posts: 23,771

Re: finding matches for a long list of patterns

I'm not that familiar with regex, so I would suggest reposting this as a separate question that asks specifically about efficiency, if you have not already.

There is a way to speed it up so that SAS does not recompile the regular expression for every observation, but I don't know enough to illustrate it.
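
The speed-up alluded to here is presumably the common idiom of compiling the pattern on the first iteration and retaining its id. A minimal sketch, assuming hypothetical dataset names HAVE and WANT and reusing the pattern from the macro call earlier in the thread:

```sas
data want;
   set have;                 /* HAVE is an assumed input dataset          */
   retain rx;                /* keep the compiled pattern id across rows  */
   /* Compile once, on the first row only. */
   if _n_ = 1 then rx = prxparse('/^2C1522\d\d\s*$/');
   check = (prxmatch(rx, serial_number) > 0);
   drop rx;
run;
```

According to the SAS documentation for PRXPARSE, a pattern given as a constant (or with the /o option) is compiled only once anyway, so this idiom matters most when the pattern comes from a variable.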

☑ This topic is solved.
