BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Scott86
Obsidian | Level 7

Hi everyone,

 

I have a long list of addresses and I need to extract just the street name:

 

Dummy data set:

 

Addresses (column name)

1000 Ngapenga rd

25 Gill Lane

po box 234

174/H Mangatin drive

102b te hono st

Te pahu rd

162 No 2 rd

 

I want the extract street name to look like

Street_name:

Ngapenga

Gill

Mangatin

Te Hono

Tepahu

No 2

 

My code is currently below: 

 

set customer_addy;
x = anydigit(addresses,1);
if x = 1 then street_name = substr(addresses,2,length(scan(addresses,2, ' ')));
run;

 

I cannot get my head around how to taken into account all the many conditions. Any help is appreciated.

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

The function to replace a word is: TRANWRD (not transword).

 

To multiple replacements, you can do:

 

address = addresses;
address = tranwrd(upcase(address), ' ST', ' ');
address = tranwrd(upcase(address), ' DR', ' ');
address = compbl(address);
 

I have added a space before the 'ST', 'DR' - to eliminate replacement in case those are substrings 

(think of EASTERN, ANDRE)

 

View solution in original post

6 REPLIES 6
Shmuel
Garnet | Level 18

You may try use translate in order to replace numers into space, and

use tranword to replace constants - like ' rd ', ' st ', ' road ', ' street ', ' lane ', etc.  - into spaces,

being aware of lowcase/uppercase, than use compbl the result and check

is ther more to do.

andreas_lds
Jade | Level 19

Write down every rule you want to apply to the variable Addresses, then start coding.

 

Maybe deleting the unwanted content is easier than extracting the required information, the last line of your example give that approach additional complexity.

 

Regular Expression seem to be the best way to extract the street names.

SASKiwi
PROC Star

What is your final objective with cleaning address data? Is it by chance anything to do with address matching? If so there are tools and services available that cleanse, standardise and match addresses to a much higher level of quality than you are ever likely to achieve yourself.

 

Your addresses look like New Zealand ones. There are tools available with NZ address localisation that can do what you require without any coding, for example SAS's Dataflux. 

gamotte
Rhodochrosite | Level 12

Hello,

 

If you have at your disposal a comprehensive list of possible street names, you can use it to match your list of adresses.

Scott86
Obsidian | Level 7

 

How do I use a transwrd function for multiple conditions.

 

address = transwrd(upcase(addresses), 'ST', ' ');
address = transwrd(upcase(addresses), 'DR', ' ');

 

This code only takes the last entry. If I create multiple variables i.e. address1, address2 then I have to different varables which I need in 1 column.

 

Any help is appreciated

Thanks

 

Shmuel
Garnet | Level 18

The function to replace a word is: TRANWRD (not transword).

 

To multiple replacements, you can do:

 

address = addresses;
address = tranwrd(upcase(address), ' ST', ' ');
address = tranwrd(upcase(address), ' DR', ' ');
address = compbl(address);
 

I have added a space before the 'ST', 'DR' - to eliminate replacement in case those are substrings 

(think of EASTERN, ANDRE)

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2009 views
  • 2 likes
  • 5 in conversation