BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fgillen1
Fluorite | Level 6

 

Hi,

 

I am doing an address matching exercise where I want to match on part of an address being within another address. Is below correct?

 

I want to find t2.address_line_3_clean within the concatenation.

 

 cats(t1.address1, t1.address2, t1.address3) LIKE "%" || t2.address_line_3_clean || "%"

Regards

Finbar

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Also you likely do not want to use CATS as the last "word" of address1 will run into the first "word" of address2 creating a string that is less likely to be "like" your target. Please see this example and a suggested alternate:

data example;
   length address1 - address3 $ 25;
   address1 = 'Apartment 456  ';
   address2 = '123 Main St    ';
   address3 = 'SomeTown       ';
   resultcats = cats(address1,address2,address3);
   resultcatx = catx(' ',address1,address2,address3);
   put resultcats= resultcatx=;
run;

And really, provide the entire code as LIKE is only valid in Proc Sql/fedsql and so may have other issues, plus at least one example of values that this doesn't work for.

 

 

You may also want to construct a variable with the contents of "%" || t2.address_line_3_clean || "%"

to see exactly what you are comparing too as there is a good chance when using the || operator that your comparison string has a number of trailing spaces. From the documentation:

Note: When you are using the % character, be aware of the effect of trailing blanks. You might have to use the TRIM function to remove trailing blanks in order to match values.

View solution in original post

3 REPLIES 3
ScottBass
Rhodochrosite | Level 12
Use Perl Regular Expression. See doc for details.

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
ballardw
Super User

Also you likely do not want to use CATS as the last "word" of address1 will run into the first "word" of address2 creating a string that is less likely to be "like" your target. Please see this example and a suggested alternate:

data example;
   length address1 - address3 $ 25;
   address1 = 'Apartment 456  ';
   address2 = '123 Main St    ';
   address3 = 'SomeTown       ';
   resultcats = cats(address1,address2,address3);
   resultcatx = catx(' ',address1,address2,address3);
   put resultcats= resultcatx=;
run;

And really, provide the entire code as LIKE is only valid in Proc Sql/fedsql and so may have other issues, plus at least one example of values that this doesn't work for.

 

 

You may also want to construct a variable with the contents of "%" || t2.address_line_3_clean || "%"

to see exactly what you are comparing too as there is a good chance when using the || operator that your comparison string has a number of trailing spaces. From the documentation:

Note: When you are using the % character, be aware of the effect of trailing blanks. You might have to use the TRIM function to remove trailing blanks in order to match values.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 2522 views
  • 3 likes
  • 4 in conversation