Hi,
I am doing an address matching exercise where I want to match on part of an address being within another address. Is below correct?
I want to find t2.address_line_3_clean within the concatenation.
cats(t1.address1, t1.address2, t1.address3) LIKE "%" || t2.address_line_3_clean || "%"
Regards
Finbar
Also you likely do not want to use CATS as the last "word" of address1 will run into the first "word" of address2 creating a string that is less likely to be "like" your target. Please see this example and a suggested alternate:
data example; length address1 - address3 $ 25; address1 = 'Apartment 456 '; address2 = '123 Main St '; address3 = 'SomeTown '; resultcats = cats(address1,address2,address3); resultcatx = catx(' ',address1,address2,address3); put resultcats= resultcatx=; run;
And really, provide the entire code as LIKE is only valid in Proc Sql/fedsql and so may have other issues, plus at least one example of values that this doesn't work for.
You may also want to construct a variable with the contents of "%" || t2.address_line_3_clean || "%"
to see exactly what you are comparing too as there is a good chance when using the || operator that your comparison string has a number of trailing spaces. From the documentation:
Note: When you are using the % character, be aware of the effect of trailing blanks. You might have to use the TRIM function to remove trailing blanks in order to match values.
Also you likely do not want to use CATS as the last "word" of address1 will run into the first "word" of address2 creating a string that is less likely to be "like" your target. Please see this example and a suggested alternate:
data example; length address1 - address3 $ 25; address1 = 'Apartment 456 '; address2 = '123 Main St '; address3 = 'SomeTown '; resultcats = cats(address1,address2,address3); resultcatx = catx(' ',address1,address2,address3); put resultcats= resultcatx=; run;
And really, provide the entire code as LIKE is only valid in Proc Sql/fedsql and so may have other issues, plus at least one example of values that this doesn't work for.
You may also want to construct a variable with the contents of "%" || t2.address_line_3_clean || "%"
to see exactly what you are comparing too as there is a good chance when using the || operator that your comparison string has a number of trailing spaces. From the documentation:
Note: When you are using the % character, be aware of the effect of trailing blanks. You might have to use the TRIM function to remove trailing blanks in order to match values.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.