I am new to SAS EG and I am not familiar with the programming side, only the point and click interface.
I have one column titled 'Name', and another column titled 'Name and Address' (sample attached). I need to determine if the name is contained in the name and address column.
What would you suggest is the best way to perform this action? In the query builder I tried creating a computed column with the contains operator and a second attempt with the scan function (which I thought was closest to a fuzzy lookup) but those did not work for me.
Any thoughts or guidance is much appreciated! Thanks in advance!
Another issue that may apply is the presence of trailing blanks. It is possible that the first comparison locates the NAME, when the second one doesn't:
location = index(name_plus_address, strip(name));
location = index(name_plus_address, name);
The STRIP function removes leading and trailing blanks. The second statement (omitting STRIP), might be searching for a longer string that includes the trailing blanks ... probably less than ideal.
This is not a fuzzy lookup, you're looking for FIND or INDEX to search for a particular string instead. This assumes that comparisons are along a single row. If you need to compare the name from row 2 with the address/name from Row 1 that's a different problem.
This would be a calculated/computed column with the query builder, under Advanced Calculations most likely. You can use the INDEX or FIND function within the computed column builder.
@RandyW wrote:
I am new to SAS EG and I am not familiar with the programming side, only the point and click interface.
I have one column titled 'Name', and another column titled 'Name and Address' (sample attached). I need to determine if the name is contained in the name and address column.
What would you suggest is the best way to perform this action? In the query builder I tried creating a computed column with the contains operator and a second attempt with the scan function (which I thought was closest to a fuzzy lookup) but those did not work for me.
Any thoughts or guidance is much appreciated! Thanks in advance!
Another issue that may apply is the presence of trailing blanks. It is possible that the first comparison locates the NAME, when the second one doesn't:
location = index(name_plus_address, strip(name));
location = index(name_plus_address, name);
The STRIP function removes leading and trailing blanks. The second statement (omitting STRIP), might be searching for a longer string that includes the trailing blanks ... probably less than ideal.
Thanks to all! The following worked best for my needs but the other suggestions were good options too!
location = index(name_plus_address, strip(name));
If you have mixed case then use FIND(Name_address,Name,'i')
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.