Hello SAS community:
I have one table named 'table1' with a column named 'X1' and a second table, 'table2', with a column named 'X2'. I want to combine a row from 'table1' and 'table2' when the value of 'X1' from 'table1' is a 'close' match to the value of 'X2' in 'table2'.
For example, if the value of 'X1' from a row in 'table1' is the string 'abc.&.123', and the value of 'X2' from a row in 'table2' is 'abc_and_123', then I would like to return a row which is the combined rows from 'table1' and 'table2' on this 'close' match between variables 'X1' from 'table1' and 'X2' from 'table2'.
I can use the LIKE operator to locate each row from each table as follows:
/*FOR table1:*/ PROC SQL; SELECT X1 FROM table1 WHERE X1 LIKE 'abc%123'; QUIT; /*FOR table2:*/ PROC SQL; SELECT X2 FROM table2 WHERE X2 LIKE 'abc%123'; QUIT;
Basically I would like to do a JOIN for the two tables on a 'close' match. I wonder if the LIKE operator can achieve this, or perhaps there is another strategy that would allow me to do the same?
Thanks.
this is a link to a paper that seems to have solved my problem. thanks again to @PaigeMiller and @SASKiwi for their helpful feedback and commentary.
Search this forum (and google the rest of the internet) for "fuzzy matching".
thanks for the tip @PaigeMiller
You haven't posted much detail on the types of fuzzy matching you want. The only example you have given looks like it is made-up. It would be helpful if you could provide more examples. Typically DATA steps are much better for fuzzy matching.
thanks for your comment @SASKiwi
this is a link to a paper that seems to have solved my problem. thanks again to @PaigeMiller and @SASKiwi for their helpful feedback and commentary.
@Varrelle - No problem, glad to hear you have figured it out for yourself. Please bear in mind that using a range of techniques is best for fuzzy matching. Start by doing exact matches, then use fuzzy techniques on the remaining data.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.