BookmarkSubscribeRSS Feed
Gayatrikunchay
Obsidian | Level 7

Hi,

 

I have a SAS dataset containing addresses like the following example:

addr 1addr 2addr 3
20 kelly dr apt 2 england home  
16 hawk lane fl 2 bertha senior living senior living
3 india road  
45 virginia street apt 6 england home  
holden st unit 3unit 3 

 

I need all the apt/unit numbers in the 2nd column and any additional level of address in the 3rd column like the following:

Is there any way to do this? Thank you!

 

addr 1addr 2addr 3
20 kelly drapt 2england home
16 hawk lanefl 2bertha senior living
3 india road  
45 virginia streetapt 6england home
holden stunit 3

 

3 REPLIES 3
PaigeMiller
Diamond | Level 26

Seems like a difficult task. The difficult task is to create a comprehensive set of rules that will separate this data the way you want. Once you have those rules, the coding isn't that hard.

 

For example, are the first three "words" always the address 1? What if someone lives at 523 West Main Street? That's four "words" for address 1. Is that possible? If so, how would you know where the address 1 ends?

 

Since you know the data, and we do not, you need to create the rules to do this and then inform us of those rules.

--
Paige Miller
SASKiwi
PROC Star

How did all of the address data end up in ADDR1 to begin with? Is that how it really is in the data source or what happened after importing into SAS? If it is the latter then fix the import. As @PaigeMiller says, address parsing is hard. There is specialist functionality in SAS Data Quality to handle this but I doubt you would have this product.available to you. In any case these don't look like standard street addresses.

 

How many addresses do you have? If it is only a few thousand then a custom solution might work. 

Ksharp
Super User

I think you'd better post your real data to explain your question.

 

data have;
input have $80.;
cards;	
20 kelly dr apt 2 england home	 	 
16 hawk lane fl 2 bertha senior living	 	
3 india road	 	 
45 virginia street apt 6 england home	 	 
holden st unit 3
;

data want;
 set have;
 p=prxmatch('/\w+\s+\d+/io',have);
 if p then do;
  addr1=substr(have,1,p-1);
  temp=substr(have,p);
  call scan(temp,2,p2,l2,' ');
  addr2=substr(temp,1,p2+l2);
  addr3=substr(temp,p2+l2);
 end;
 else   addr1=have;
 keep addr1 addr2 addr3;
 run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 946 views
  • 0 likes
  • 4 in conversation