Hi all! I am in the process in completing physical address cleanup. One of the issues I want to clean is removing the city name in the address line. I have created mock data, via datalines, to illustrate example data in a table called "Address_Mock". I have a table called "ZipCodes_Clean" which contains all U.S. states and territories zip codes with all the possible cities attributed to those zip codes in columns named City1 to City130. The following is an example from the table:
Zipcode State Country WorldRegion City1 City2 City3 City4 City5 City6 City7 City8 City9 City10 33172 FL US NA MIAMI DORAL DORAL BRANCH SWEETWATER WEST MIAMI
I am using two SAS functions (FINDW and TRANWRD) in a conditional IF-THEN statement to locate if the city is in the address line, create new column containing the address line with the city replaced by blank space. However, the final result has been blank result in the new column.
Below is the mock code of how I am currently trying to resolve the issue:
data Address_Mock;
infile datalines dsd;
input Name: $8. Address_Line: $26. ZipCode ;
datalines;
MICHELLE,"10597 NW 7 TER MIAMI",33172
NICOLE,"10597 NW 7 TER DORAL",33172
MARIA,"10597 NW 7 TER WEST MIAMI",33172
ANTONIO,"10597 NW 7 TER SWEETWATER",33172
;
run;
proc sql;
create table Address_Join as
select t1.*, t2.*
from Address_Mock as t1 left join ZipCodes_Clean as t2 on t1.ZipCode=t2.ZipCode;
quit;
data Address_Loop;
set Address_Join;
array cols {*} city1-city130 ;
do i=1 to 130;
if FINDW(Address_Line,cols{i}) then New_Address_Line=TRANWRD(Address_Line,cols{i},"");
end;
run;
I'll be very appreciative if anyone sees the error in the code or I'm open to trying alternative code. Thanks in advance!
... View more