
Perl regular expression HELP

Occasional Contributor
Posts: 13


I have a SAS table of about 15,000 people's business names and PO Boxes, keyed by ID number. I need to match on the PO Box against an Excel sheet, but in the SAS table the PO Box is written in all sorts of ways, e.g., P O Box, P.O. Box, POBox, P. Box, etc. I found a SUGI paper in which someone used Perl regular expressions to reformat all of these to the same P O Box format. The paper read just a few lines of data from a datalines statement; I used the same code but with a SET statement pointing at my SAS table. What I wrote works (I tried it on 6 ID numbers), but it is SO slow. It took 5 minutes to match 6, and I need to do this for 15,000. Is there a better, faster way to do this? Please help! What I have is:


Data pobox_fix;
Set POBox;
Pochg=prxchange("s/P?\s*\.*\s*O?\s*\.*\s*BOX\s*\.*\s*/P O BOX /i",-1,address);
Run;

I'm new to these Perl reg expressions but I have read several papers at this point.
Occasional Contributor
Posts: 13

Re: Perl regular expression HELP

Posted in reply to missmeliss22
I found the reference on page 9 of the NESUG paper at www.lexjansen.com/nesug/nesug12/bb/bb17.pdf
Esteemed Advisor
Posts: 5,625

Re: Perl regular expression HELP

Posted in reply to missmeliss22

The time-consuming part must be elsewhere, unless address is a very, very long string. There is no particular reason for this expression to be slow.
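
One way to check where the time goes is to run the same expression on a throwaway table of similar size and read the real time and CPU time that SAS writes to the log for the step. The following is only a sketch under assumptions: the data set name pobox_test, the four sample spellings, and the variable lengths are made up for illustration and are not from the original post. It also compiles the pattern explicitly with prxparse on the first row. That is not expected to change much, since SAS compiles a constant pattern passed to prxchange only once, but it makes the one-time compilation visible.

```sas
/* Hypothetical test data: 15,000 rows cycling through a few PO Box spellings. */
data pobox_test;
   length address $60;
   array v[4] $12 _temporary_ ('P O Box' 'P.O. Box' 'POBox' 'P. Box');
   do id = 1 to 15000;
      address = catx(' ', v[mod(id, 4) + 1], put(id, 5.));
      output;
   end;
run;

data pobox_fix;
   set pobox_test;
   length pochg $80;   /* assumed length; adjust to the real address width */
   retain re;
   /* Compile the substitution pattern once, on the first row only. */
   if _n_ = 1 then
      re = prxparse('s/P?\s*\.*\s*O?\s*\.*\s*BOX\s*\.*\s*/P O BOX /i');
   /* -1 means: replace every match in the string. */
   pochg = prxchange(re, -1, address);
   drop re;
run;
```

If this step finishes quickly on the test table, the slowness most likely comes from something other than the regular expression, for example how the real POBox table is read or how the later match against the Excel sheet is done.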

PG