Hi,
I am doing an address matching exercise where I want to match on part of an address being within another address. Is below correct?
I want to find t2.address_line_3_clean within the concatenation.
cats(t1.address1, t1.address2, t1.address3) LIKE "%" || t2.address_line_3_clean || "%"
Regards
Finbar
Also you likely do not want to use CATS as the last "word" of address1 will run into the first "word" of address2 creating a string that is less likely to be "like" your target. Please see this example and a suggested alternate:
data example; length address1 - address3 $ 25; address1 = 'Apartment 456 '; address2 = '123 Main St '; address3 = 'SomeTown '; resultcats = cats(address1,address2,address3); resultcatx = catx(' ',address1,address2,address3); put resultcats= resultcatx=; run;
And really, provide the entire code as LIKE is only valid in Proc Sql/fedsql and so may have other issues, plus at least one example of values that this doesn't work for.
You may also want to construct a variable with the contents of "%" || t2.address_line_3_clean || "%"
to see exactly what you are comparing too as there is a good chance when using the || operator that your comparison string has a number of trailing spaces. From the documentation:
Note: When you are using the % character, be aware of the effect of trailing blanks. You might have to use the TRIM function to remove trailing blanks in order to match values.
Also you likely do not want to use CATS as the last "word" of address1 will run into the first "word" of address2 creating a string that is less likely to be "like" your target. Please see this example and a suggested alternate:
data example; length address1 - address3 $ 25; address1 = 'Apartment 456 '; address2 = '123 Main St '; address3 = 'SomeTown '; resultcats = cats(address1,address2,address3); resultcatx = catx(' ',address1,address2,address3); put resultcats= resultcatx=; run;
And really, provide the entire code as LIKE is only valid in Proc Sql/fedsql and so may have other issues, plus at least one example of values that this doesn't work for.
You may also want to construct a variable with the contents of "%" || t2.address_line_3_clean || "%"
to see exactly what you are comparing too as there is a good chance when using the || operator that your comparison string has a number of trailing spaces. From the documentation:
Note: When you are using the % character, be aware of the effect of trailing blanks. You might have to use the TRIM function to remove trailing blanks in order to match values.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.