BookmarkSubscribeRSS Feed
missmeliss22
Calcite | Level 5
I have a table in SAS with ~15,000 peoples' business name and PO Box with their ID number. I need to match on the PO Box in an Excel sheet, but in the SAS table PO Box is written all sorts of ways i.e., P O Box, P.O. Box POBox, P. Box, etc. I found a SUGI paper where someone used Perl reg expressions to reformat all to the same P O Box format. I used this code setting my SAS table where in the paper they used a datalines with just a few lines of code. What I wrote works (tried it for 6 ID numbers); however, it is SO slow. It took 5 minutes to match 6 and I need to do this for 15,000...is there a better faster way to do this? Please help! What I have is


Data pobox_fix;
Set POBox;
Pochg=prxchange("s/P?\s*\.*\s*O?\s*\.*\s*BOX\s*\.*\s*/P O BOX /i",-1,address);
Run;

I'm new to these Perl reg expressions but I have read several papers at this point.
2 REPLIES 2
missmeliss22
Calcite | Level 5
** found reference on page 9 of the NESUG paper at www.lexjansen.com/nesug/nesug12/bb/bb17.pdf
PGStats
Opal | Level 21

The time consuming part must be elsewhere unless address is a very very long string. There is no particular reason for this expression to be slow.

PG

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1128 views
  • 0 likes
  • 2 in conversation