Hello I have an text string that has the address with city, state and zip code. I can extract the state, zip code, city from the end of the string, but I can not figure out how to get the address from the beginning of the string. I can not get the tranwrd function to work using a concatenation of the string from what I have extracted from the string so far. Is there an elegant way to extract the first part of the string that contains the address.
here is a sample of what the string looks like:
address_string
46 Siwash Creek Rd Tonasket Wa 98855
Po Box 124 Ford Wa 99013
Po Box 292 Ford Wa 99013
Po Box 124 Welpinit Wa 99040
Po Box 47 Wellpinit Wa 99040
Po Box 267 Wellpinit Wa 99040
Po Box 276 Wellpinit Wa 99040
5443 Ford Wellpinit Rd Wellpinit Wa 99040
Po Box 188 Wellpinit Wa 99040
Po Box 416 Wellpinit Wa 99040
5962a Martha Boardman Rd Wellpinit Wa 99040
Po Box 35 Wellpinit Wa 99040
2915 Number One Canyon Rd Wenatchee Wa 98801-2462
785 Skidmore Rd Colville Wa 99114
You might be able to use something like:
data have;
format string $char70.;
input;
string=_infile_;
cards;
46 Siwash Creek Rd Tonasket Wa 98855
Po Box 124 Ford Wa 99013
Po Box 292 Ford Wa 99013
Po Box 124 Welpinit Wa 99040
Po Box 47 Wellpinit Wa 99040
Po Box 267 Wellpinit Wa 99040
Po Box 276 Wellpinit Wa 99040
5443 Ford Wellpinit Rd Wellpinit Wa 99040
Po Box 188 Wellpinit Wa 99040
Po Box 416 Wellpinit Wa 99040
5962a Martha Boardman Rd Wellpinit Wa 99040
Po Box 35 Wellpinit Wa 99040
2915 Number One Canyon Rd Wenatchee Wa 98801-2462
785 Skidmore Rd Colville Wa 99114
;
data want (keep=address zip state);
set have;
zip=scan(string,-1,' ');
state=scan(string,-2,' ');
call scan(string,-2,x,y,' ');
address=substrn(string,1,x-1);
run;
You might be able to use something like:
data have;
format string $char70.;
input;
string=_infile_;
cards;
46 Siwash Creek Rd Tonasket Wa 98855
Po Box 124 Ford Wa 99013
Po Box 292 Ford Wa 99013
Po Box 124 Welpinit Wa 99040
Po Box 47 Wellpinit Wa 99040
Po Box 267 Wellpinit Wa 99040
Po Box 276 Wellpinit Wa 99040
5443 Ford Wellpinit Rd Wellpinit Wa 99040
Po Box 188 Wellpinit Wa 99040
Po Box 416 Wellpinit Wa 99040
5962a Martha Boardman Rd Wellpinit Wa 99040
Po Box 35 Wellpinit Wa 99040
2915 Number One Canyon Rd Wenatchee Wa 98801-2462
785 Skidmore Rd Colville Wa 99114
;
data want (keep=address zip state);
set have;
zip=scan(string,-1,' ');
state=scan(string,-2,' ');
call scan(string,-2,x,y,' ');
address=substrn(string,1,x-1);
run;
Thanks arthur, this is really slick. I had to modify a little for cities with two words, but this is exactly what I wanted to do.
Now I need to figure out why it works and how it works.
The following code separates all the words (depending on the length of the address you might have to change the code, perhaps a macro will help you).
Here is the code
data want1;
set have;
j=index(string," ");
add1=substr(string,1,j-1);
restadd=substr(string,j+1);
j = index(restadd," ");
add2=substr(restadd,1,j-1);
restadd=substr(restadd,j+1);
j = index(restadd," ");
add3=substr(restadd,1,j-1);
restadd=substr(restadd,j+1);
j = index(restadd," ");
add4=substr(restadd,1,j-1);
restadd=substr(restadd,j+1);
j = index(restadd," ");
add5=substr(restadd,1,j-1);
restadd=substr(restadd,j+1);
j = index(restadd," ");
add6=substr(restadd,1,j-1);
restadd=substr(restadd,j+1);
j = index(restadd," ");
add7=substr(restadd,1,j-1);
restadd=substr(restadd,j+1);
drop j restadd;
run;
output
string | add1 | add2 | add3 | add4 | add5 | add6 | add7 |
46 Siwash Creek Rd Tonasket Wa 98855 | 46 | Siwash | Creek | Rd | Tonasket | Wa | 98855 |
Po Box 124 Ford Wa 99013 | Po | Box | 124 | Ford | Wa | 99013 | 99013 |
Po Box 292 Ford Wa 99013 | Po | Box | 292 | Ford | Wa | 99013 | 99013 |
Po Box 124 Welpinit Wa 99040 | Po | Box | 124 | Welpinit | Wa | 99040 | 99040 |
Po Box 47 Wellpinit Wa 99040 | Po | Box | 47 | Wellpinit | Wa | 99040 | 99040 |
Po Box 267 Wellpinit Wa 99040 | Po | Box | 267 | Wellpinit | Wa | 99040 | 99040 |
Po Box 276 Wellpinit Wa 99040 | Po | Box | 276 | Wellpinit | Wa | 99040 | 99040 |
5443 Ford Wellpinit Rd Wellpinit Wa 99040 | 5443 | Ford | Wellpinit | Rd | Wellpinit | Wa | 99040 |
Po Box 188 Wellpinit Wa 99040 | Po | Box | 188 | Wellpinit | Wa | 99040 | 99040 |
Po Box 416 Wellpinit Wa 99040 | Po | Box | 416 | Wellpinit | Wa | 99040 | 99040 |
5962a Martha Boardman Rd Wellpinit Wa 99040 | 5962a | Martha | Boardman | Rd | Wellpinit | Wa | 99040 |
Po Box 35 Wellpinit Wa 99040 | Po | Box | 35 | Wellpinit | Wa | 99040 | 99040 |
2915 Number One Canyon Rd Wenatchee Wa 98801-2462 | 2915 | Number | One | Canyon | Rd | Wenatchee | Wa |
785 Skidmore Rd Colville Wa 99114 | 785 | Skidmore | Rd | Colville | Wa | 99114 | 99114 |
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.