gzr2mz39
Quartz | Level 8

In addition to having a long list of patterns (over 50) to check using regex, I need to check these patterns against more than 700,000 observations.

Does anyone have any advice for improving efficiency?

Here's the macro I'm using to accomplish this task:

%macro prx(pattern,serial);
b=prxparse("&pattern");
if prxmatch(b,serial_number)>0 then do;
check=1;
serial=&serial;
if (length(serial) = length(serial_number)) then check=2;
end;
%mend;

Thank you.

4 REPLIES
ChrisNZ
Tourmaline | Level 20

The first things that come to mind, without knowing more:

- Can you use functions like index() or similar? They are a lot cheaper than RegEx.

- Can you use else if to avoid searching once a pattern is matched?

 

This may possibly be cheaper too:

if prxmatch("&pattern",serial_number)>0 then do;
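To illustrate both suggestions together, here is a minimal sketch rather than a drop-in replacement for the macro. The input data set name (serials), the literal 'ABC', and the pattern '/^XY\d{4}/' are all hypothetical and stand in for the real list of patterns:

```sas
data checked;
  set serials;                 /* hypothetical input with variable serial_number */
  length serial $20;
  check = 0;

  /* Cheap literal test first: find() returns the match position, or 0 */
  if find(serial_number, 'ABC') > 0 then do;
    check = 1;
    serial = 'ABC';
  end;
  /* ELSE IF: this regex is only evaluated when the test above failed */
  else if prxmatch('/^XY\d{4}/', serial_number) > 0 then do;
    check = 1;
    serial = 'XY';
  end;
run;
```

Ordering the branches so that the most frequent matches come first should reduce the number of tests per row, though how much it helps depends on the data.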

PGStats
Opal | Level 21

Make sure your pattern uses the "o" suffix, as in "/abc[a-c]+/o", as it signals to the compiler that the pattern is a constant that only needs to be compiled once.
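Another way to make the single compile explicit is to call prxparse only on the first iteration and retain the returned id. A minimal sketch, assuming a hypothetical input data set named serials with a serial_number variable:

```sas
data _null_;
  set serials;                 /* hypothetical input data set */
  retain re;                   /* keep the pattern id across iterations */

  /* Compile the pattern once, on the first observation only */
  if _n_ = 1 then re = prxparse('/abc[a-c]+/');

  if prxmatch(re, serial_number) > 0 then put serial_number=;
run;
```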

PG
ChrisNZ
Tourmaline | Level 20

@PGStats 

My understanding was that SAS used the o suffix by default in recent (9.4 ?) versions of SAS if the RegEx string was a constant. 

I can't find a source though, so maybe I am mistaken.

 

Update: I did a quick test, this runs the same with and without the o.

data _null_;
 do I=1 to 1e7; 
   R=prxmatch('/\d\w\d/o',cat(I));
 end;
run;
Patrick
Opal | Level 21

As others already wrote: Certainly use ELSE and use functions like find() or index() where possible.

If leading and trailing blanks are not important then use STRIP() as well: prxmatch(<regex>,strip(<variable>))

And last but not least: Tweak your RegEx, especially the ones applied to long strings, e.g. greedy vs. lazy quantifiers.
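As a small illustration of the greedy vs. lazy difference, here is a sketch using a made-up test string (not one of the real serial numbers):

```sas
data _null_;
  text = '<a><b><c>';

  greedy = prxparse('/<.*>/');   /* greedy: runs to the end, then backtracks */
  lazy   = prxparse('/<.*?>/');  /* lazy: stops at the first closing bracket */

  call prxsubstr(greedy, text, pos_g, len_g);
  call prxsubstr(lazy,   text, pos_l, len_l);

  /* The greedy match spans all 9 characters; the lazy match is just '<a>' */
  put len_g= len_l=;
run;
```

On long strings the greedy form can scan and backtrack over far more text than needed, which is where the cost tends to show up.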

