BookmarkSubscribeRSS Feed
missmeliss22
Calcite | Level 5
I have a table in SAS with ~15,000 peoples' business name and PO Box with their ID number. I need to match on the PO Box in an Excel sheet, but in the SAS table PO Box is written all sorts of ways i.e., P O Box, P.O. Box POBox, P. Box, etc. I found a SUGI paper where someone used Perl reg expressions to reformat all to the same P O Box format. I used this code setting my SAS table where in the paper they used a datalines with just a few lines of code. What I wrote works (tried it for 6 ID numbers); however, it is SO slow. It took 5 minutes to match 6 and I need to do this for 15,000...is there a better faster way to do this? Please help! What I have is


Data pobox_fix;
Set POBox;
Pochg=prxchange("s/P?\s*\.*\s*O?\s*\.*\s*BOX\s*\.*\s*/P O BOX /i",-1,address);
Run;

I'm new to these Perl reg expressions but I have read several papers at this point.
2 REPLIES 2
missmeliss22
Calcite | Level 5
** found reference on page 9 of the NESUG paper at www.lexjansen.com/nesug/nesug12/bb/bb17.pdf
PGStats
Opal | Level 21

The time consuming part must be elsewhere unless address is a very very long string. There is no particular reason for this expression to be slow.

PG

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1105 views
  • 0 likes
  • 2 in conversation