BookmarkSubscribeRSS Feed
missmeliss22
Calcite | Level 5
I have a table in SAS with ~15,000 peoples' business name and PO Box with their ID number. I need to match on the PO Box in an Excel sheet, but in the SAS table PO Box is written all sorts of ways i.e., P O Box, P.O. Box POBox, P. Box, etc. I found a SUGI paper where someone used Perl reg expressions to reformat all to the same P O Box format. I used this code setting my SAS table where in the paper they used a datalines with just a few lines of code. What I wrote works (tried it for 6 ID numbers); however, it is SO slow. It took 5 minutes to match 6 and I need to do this for 15,000...is there a better faster way to do this? Please help! What I have is


Data pobox_fix;
Set POBox;
Pochg=prxchange("s/P?\s*\.*\s*O?\s*\.*\s*BOX\s*\.*\s*/P O BOX /i",-1,address);
Run;

I'm new to these Perl reg expressions but I have read several papers at this point.
2 REPLIES 2
missmeliss22
Calcite | Level 5
** found reference on page 9 of the NESUG paper at www.lexjansen.com/nesug/nesug12/bb/bb17.pdf
PGStats
Opal | Level 21

The time consuming part must be elsewhere unless address is a very very long string. There is no particular reason for this expression to be slow.

PG

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 679 views
  • 0 likes
  • 2 in conversation