Scott86
Obsidian | Level 7

Hi everyone,

 

I have a long list of addresses and I need to extract just the street name:

 

Dummy data set:

 

Addresses (column name)

1000 Ngapenga rd

25 Gill Lane

po box 234

174/H Mangatin drive

102b te hono st

Te pahu rd

162 No 2 rd

 

I want the extracted street names to look like this:

Street_name:

Ngapenga

Gill

Mangatin

Te Hono

Tepahu

No 2

 

My code is currently below: 

 

set customer_addy;
x = anydigit(addresses,1);
if x = 1 then street_name = substr(addresses,2,length(scan(addresses,2, ' ')));
run;

 

I cannot get my head around how to take all the different conditions into account. Any help is appreciated.

 

Thanks

6 REPLIES
Shmuel
Garnet | Level 18

You could try using TRANSLATE to replace digits with spaces, and

TRANWRD to replace constants such as ' rd ', ' st ', ' road ', ' street ', ' lane ', etc. with spaces,

taking care with lowercase/uppercase. Then apply COMPBL to the result and check

whether there is more to do.

andreas_lds
Jade | Level 19

Write down every rule you want to apply to the variable Addresses, then start coding.

 

Deleting the unwanted content may be easier than extracting the required information, although the last line of your example (162 No 2 rd) makes that approach more complicated.

 

Regular expressions seem to be the best way to extract the street names.
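A regular-expression version might look like the sketch below. It uses the Perl regular expression functions PRXPARSE, PRXMATCH and PRXPOSN. The suffix list, the output dataset name street_names and the rule for skipping a leading house or unit token are assumptions based only on the seven sample rows, so real data will almost certainly need more rules.

```sas
/* Sample rows from the question */
data customer_addy;
  input addresses $40.;
  datalines;
1000 Ngapenga rd
25 Gill Lane
po box 234
174/H Mangatin drive
102b te hono st
Te pahu rd
162 No 2 rd
;
run;

data street_names;              /* output dataset name is an assumption */
  set customer_addy;
  length street_name $ 40;
  retain re;
  /* Compile once: optional leading house/unit token (1000, 174/H, 102b),
     then the captured street name, then one of the assumed suffixes */
  if _n_ = 1 then
    re = prxparse('/^\s*(?:\d+\S*\s+)?(.+?)\s+(?:road|rd|street|st|drive|dr|lane)\s*$/i');
  /* Rows that do not end in a listed suffix (e.g. the PO box) stay blank */
  if prxmatch(re, addresses) then
    street_name = propcase(prxposn(re, 1, addresses));
  drop re;
run;
```

PROPCASE is only there to approximate the capitalisation in the question; it would give "Te Pahu" rather than the "Tepahu" shown in the desired output, so that row would need its own rule.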

SASKiwi
PROC Star

What is your final objective with cleaning address data? Is it by chance anything to do with address matching? If so there are tools and services available that cleanse, standardise and match addresses to a much higher level of quality than you are ever likely to achieve yourself.

 

Your addresses look like New Zealand ones. There are tools available with NZ address localisation that can do what you require without any coding, for example SAS's DataFlux.

gamotte
Rhodochrosite | Level 12

Hello,

 

If you have at your disposal a comprehensive list of possible street names, you can use it to match your list of addresses.
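As a rough sketch of that idea, the lookup could be written as a PROC SQL join. The street_list table, its contents and the output name matched are hypothetical; in practice the list would come from an authoritative source, and the three sample addresses are only there to make the example runnable.

```sas
/* A few sample addresses from the question */
data customer_addy;
  input addresses $40.;
  datalines;
25 Gill Lane
102b te hono st
po box 234
;
run;

/* Hypothetical reference list of known street names */
data street_list;
  input street_name $40.;
  datalines;
Ngapenga
Gill
Mangatin
Te Hono
;
run;

proc sql;
  create table matched as
  select a.addresses, s.street_name
  from customer_addy as a
  left join street_list as s
    /* Wrap both sides in blanks so that only whole words match */
    on index(' ' || upcase(strip(a.addresses)) || ' ',
             ' ' || upcase(strip(s.street_name)) || ' ') > 0;
quit;
```

Addresses with no match keep a missing street_name, and an address that contains more than one listed name would produce more than one row. This kind of join compares every address with every street name, so it may be slow on long lists.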

Scott86
Obsidian | Level 7

 

How do I use the transwrd function for multiple conditions?

 

address = transwrd(upcase(addresses), 'ST', ' ');
address = transwrd(upcase(addresses), 'DR', ' ');

 

This code only applies the last replacement. If I create multiple variables (e.g. address1, address2), then I have two different variables, but I need the result in one column.

 

Any help is appreciated

Thanks

 

Shmuel
Garnet | Level 18

The function to replace a word is TRANWRD (not transword).

To make multiple replacements, apply each one to the result of the previous one:

address = addresses;
address = tranwrd(upcase(address), ' ST', ' ');
address = tranwrd(upcase(address), ' DR', ' ');
address = compbl(address);   /* collapse repeated blanks into one */

I added a space before 'ST' and 'DR' so that they are not replaced when they occur inside a word (think of EASTERN, ANDRE).
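Putting the suggestions in this thread together, a fuller data step might look like the sketch below. The output name street_names, the length of 200 and the suffix list are illustrative assumptions, and the three sample rows come from the question. Each suffix is searched for with a blank on both sides, because a leading blank alone still matches the start of a longer word (' ST' also occurs in ' STANLEY').

```sas
/* A few sample addresses from the question */
data customer_addy;
  input addresses $40.;
  datalines;
1000 Ngapenga rd
25 Gill Lane
102b te hono st
;
run;

data street_names;                /* output dataset name is an assumption */
  set customer_addy;
  length street_name $ 200;       /* assumed longer than any address */
  /* Upper-case once and add a leading blank; trailing blanks are already
     present because SAS pads character variables with blanks */
  street_name = ' ' || upcase(addresses);
  /* Each call works on the result of the previous call */
  street_name = tranwrd(street_name, ' ROAD ', ' ');
  street_name = tranwrd(street_name, ' RD ', ' ');
  street_name = tranwrd(street_name, ' STREET ', ' ');
  street_name = tranwrd(street_name, ' ST ', ' ');
  street_name = tranwrd(street_name, ' DRIVE ', ' ');
  street_name = tranwrd(street_name, ' DR ', ' ');
  street_name = tranwrd(street_name, ' LANE ', ' ');
  /* Replace each digit with a blank (the second argument is ten blanks) */
  street_name = translate(street_name, '          ', '0123456789');
  /* Collapse repeated blanks, trim, and restore mixed case */
  street_name = propcase(strip(compbl(street_name)));
run;
```

This is deliberately incomplete: unit letters such as the "b" in "102b" or the "/H" in "174/H" are left behind, PO box rows are not filtered out, and blanking all digits would break a name like "No 2", so those cases still need their own rules.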

 
