Hi there
My input data is below, I would like to change Rd to Road, AVE to Avenue, CL to close etc this is part of it I have lots of weird address as well.
I have tried propcase(tranwrd(ADDR,'Cl','Close')) as ADDR. But this only does one name per time. Is there any chance I can update Rd, AVE and CL in one data step?
Input
Name | Addr |
Amy | 12 Stream Rd |
Mary | 1/891 Mason AVE |
John | 821 Stream Road |
Ivy | 67 Tork CL |
Martin | 5/891 Mason Avenue |
Expectation output
Name | Address |
Amy | 12 Stream Road |
Mary | 1/891 Mason Avenue |
John | 821 Stream Road |
Ivy | 67 Tork Close |
Martin | 5/891 Mason Avenue |
Many thanks!
This code works
data have;
Retain Name Address;
length Address $ 100;
input Name $ _Addr $ 100.;
infile datalines dsd dlm=',' truncover;
Address=tranwrd(tranwrd(tranwrd(_Addr,'Rd', "Road"),'AVE', "Avenue"),'CL',"Close");
drop _Addr;
datalines;
Amy ,2 Stream Rd
Mary,1/891 Mason AVE
John,821 Stream Road
Ivy,67 Tork CL
Martin ,5/891 Mason Avenue
;
run;
The output will be as you wanted
I don't understand the problem: "I have tried propcase(tranwrd(ADDR,'Cl','Close')) as ADDR. But this only does one name per time." - Please post the code you have used along with the log. So that we what happens actually.
For data cleaning i would always use a data step, the code is easier to debug. One problem could be caused by the length of the variable addr. Is it long enough to hold the additional chars?
A data step could look this:
data want;
set have;
length addr_expanded $ 100;
addr_expanded = tranwrd(addr, 'Rd', 'Road');
addr_expanded = tranwrd(addr_expanded, 'AVE', 'Avenue');
addr_expanded = tranwrd(addr_expanded, 'CL', Close');
run;
You could write many TRANWRD() for this in a data step. Like : data want; set have; ADDR= propcase(prxchange('s/\bCl\b/Close/,-1,ADDR')); ADDR= propcase(prxchange('s/\bRd\b/Road/,-1,ADDR')); ADDR= propcase(prxchange('s/\bAVE\b/Avenue/,-1,ADDR')); run;
This code works
data have;
Retain Name Address;
length Address $ 100;
input Name $ _Addr $ 100.;
infile datalines dsd dlm=',' truncover;
Address=tranwrd(tranwrd(tranwrd(_Addr,'Rd', "Road"),'AVE', "Avenue"),'CL',"Close");
drop _Addr;
datalines;
Amy ,2 Stream Rd
Mary,1/891 Mason AVE
John,821 Stream Road
Ivy,67 Tork CL
Martin ,5/891 Mason Avenue
;
run;
The output will be as you wanted
datalines;
Amy ,2 Stream St
Mary,1/891 Mason Street
John,821 Stream Road
Ivy,67 Tork CL
Martin ,5/891 Mason Avenue
;
I have one row that is St and another is Street. Applying the code tranwrd(_Addr,'St', "Street"). The result turn up to be Streetreet. Any chance of fixing that?
Amy 2 Streetream Street
Mary 1/891 Mason Streetreet
if you run the tranwrd twice you will get Streetreet
I have update the code to account for the presence of St in the address.
data have;
Retain Name Address;
length Address $ 100;
input Name $ _Addr $ 100.;
infile datalines dsd dlm=',' truncover;
Address=tranwrd(tranwrd(tranwrd(tranwrd(_Addr,'Rd', "Road"),'AVE', "Avenue"),'CL',"Close"),' St '," Street");
drop _Addr;
datalines;
Samy ,2 Stream St
Amy ,2 Stream Rd
Mary,1/891 Mason AVE
John,821 Stream Road
Ivy,67 Tork CL
Martin ,5/891 Mason Avenue
;
run;
The output will be like as shown below. It is what you want.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.