Hi there
My input data is below, I would like to change Rd to Road, AVE to Avenue, CL to close etc this is part of it I have lots of weird address as well.
I have tried propcase(tranwrd(ADDR,'Cl','Close')) as ADDR. But this only does one name per time. Is there any chance I can update Rd, AVE and CL in one data step?
Input
| Name | Addr |
| Amy | 12 Stream Rd |
| Mary | 1/891 Mason AVE |
| John | 821 Stream Road |
| Ivy | 67 Tork CL |
| Martin | 5/891 Mason Avenue |
Expectation output
| Name | Address |
| Amy | 12 Stream Road |
| Mary | 1/891 Mason Avenue |
| John | 821 Stream Road |
| Ivy | 67 Tork Close |
| Martin | 5/891 Mason Avenue |
Many thanks!
This code works
data have;
Retain Name Address;
length Address $ 100;
input Name $ _Addr $ 100.;
infile datalines dsd dlm=',' truncover;
Address=tranwrd(tranwrd(tranwrd(_Addr,'Rd', "Road"),'AVE', "Avenue"),'CL',"Close");
drop _Addr;
datalines;
Amy ,2 Stream Rd
Mary,1/891 Mason AVE
John,821 Stream Road
Ivy,67 Tork CL
Martin ,5/891 Mason Avenue
;
run;
The output will be as you wanted
I don't understand the problem: "I have tried propcase(tranwrd(ADDR,'Cl','Close')) as ADDR. But this only does one name per time." - Please post the code you have used along with the log. So that we what happens actually.
For data cleaning i would always use a data step, the code is easier to debug. One problem could be caused by the length of the variable addr. Is it long enough to hold the additional chars?
A data step could look this:
data want;
set have;
length addr_expanded $ 100;
addr_expanded = tranwrd(addr, 'Rd', 'Road');
addr_expanded = tranwrd(addr_expanded, 'AVE', 'Avenue');
addr_expanded = tranwrd(addr_expanded, 'CL', Close');
run;
You could write many TRANWRD() for this in a data step. Like :
data want;
set have;
ADDR= propcase(prxchange('s/\bCl\b/Close/,-1,ADDR'));
ADDR= propcase(prxchange('s/\bRd\b/Road/,-1,ADDR'));
ADDR= propcase(prxchange('s/\bAVE\b/Avenue/,-1,ADDR'));
run;
This code works
data have;
Retain Name Address;
length Address $ 100;
input Name $ _Addr $ 100.;
infile datalines dsd dlm=',' truncover;
Address=tranwrd(tranwrd(tranwrd(_Addr,'Rd', "Road"),'AVE', "Avenue"),'CL',"Close");
drop _Addr;
datalines;
Amy ,2 Stream Rd
Mary,1/891 Mason AVE
John,821 Stream Road
Ivy,67 Tork CL
Martin ,5/891 Mason Avenue
;
run;
The output will be as you wanted
datalines;
Amy ,2 Stream St
Mary,1/891 Mason Street
John,821 Stream Road
Ivy,67 Tork CL
Martin ,5/891 Mason Avenue
;
I have one row that is St and another is Street. Applying the code tranwrd(_Addr,'St', "Street"). The result turn up to be Streetreet. Any chance of fixing that?
Amy 2 Streetream Street
Mary 1/891 Mason Streetreet
if you run the tranwrd twice you will get Streetreet
I have update the code to account for the presence of St in the address.
data have;
Retain Name Address;
length Address $ 100;
input Name $ _Addr $ 100.;
infile datalines dsd dlm=',' truncover;
Address=tranwrd(tranwrd(tranwrd(tranwrd(_Addr,'Rd', "Road"),'AVE', "Avenue"),'CL',"Close"),' St '," Street");
drop _Addr;
datalines;
Samy ,2 Stream St
Amy ,2 Stream Rd
Mary,1/891 Mason AVE
John,821 Stream Road
Ivy,67 Tork CL
Martin ,5/891 Mason Avenue
;
run;
The output will be like as shown below. It is what you want.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.